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1.  INTRODUCTION: 


Islets  are  composed  of  >5  endocrine  cell  types  that  perform  complementary  functions  to  maintain  proper 
glucose  homeostasis.  This  cellular  heterogeneity  impedes  our  ability  to  understand  the  precise  transcriptional 
repertoire  and  regulatory  landscape  of  each  cell  type  and  to  determine  how  these  programs  in  each  cell  type  are 
perturbed  in  type  2  diabetes  (T2D).  The  overarching  goal  of  this  project  is  to  determine,  with  single  cell 
resolution,  changes  in  cellular  composition  and  cell-specific  gene  expression  programs  elicited  by  T2D  in 
human  islets  using  innovative  single  cell  transcriptomic  (scRNA-seq;  Aim  1)  and  epigenomic  (scATAC-seq; 
Aim  2)  technologies. 


2.  KEYWORDS: 

Single  cell;  epigenome;  scATAC-seq;  scRNA-seq;  transcriptome;  human  islet;  alpha;  beta;  delta;  pancreatic 
polypeptide  (PP);  gamma;  epsilon;  endocrine 


3.  ACCOMPLISHMENTS: 

What  were  the  major  goals  of  the  project? 

Major  goals  of  the  project: 

Aim  1:  Islet  single  cell  transcriptomes 

la:  Non-diabetic  (ND)  islet  single  cell  transcriptomes 

Milestone  (12  months):  -1000  single  cell  transcriptome  profiles  from  5  ND  islets 

Achieved:  4806  single  cell  transcriptome  profiles  from  1  ND  islet  (-500%  of  cells,  20%  of  samples  to  date) 

lb:  Type  2  diabetic  (T2D)  islet  single  cell  transcriptomes 

Milestone  (12  months):  -1000  single  cell  transcriptomes  from  5  T2D  islets 

Achieved:  -4192  single  cell  transcriptome  profiles  from  1  T2D  islet  (-400%  of  cells,  20%  of  samples  to  date) 
lc:  Determine  cell  type  transcriptome  signatures  in  ND  and  T2D  samples 

Milestone  (anticipated,  month  15):  Comprehensive  analysis  of  islet  transcriptomes  and  identification  of  cell 
type-specific  transcriptomes  /  “signature”  genes 

Achieved:  Identification  of  cell  type-specific  transcriptomes  and  signature  genes  from  two  samples 
Id:  Identify  cell  type-specific  expression  differences  between  ND  and  T2D  samples 

Milestone  (anticipated,  month  15):  Identification  of  cell  type-specific  differential  expression  in  T2D  vs.  ND 
samples 

Aim  2:  Islet  single  cell  epigenomes 

2a:  Non-cliabetic  (ND)  islet  single  cell  epigenomes 

Milestone  (12  months):  -1000  single  cell  epigenome  (scATAC-seq)  profiles 
Achieved:  -400  single  cell  epigenome  profiles  (40%) 

2b:  Type  2  diabetic  (T2D)  islet  single  cell  epigenomes 

Milestone  (12  months):  -1000  single  cell  epigenome  (scATAC-seq)  profiles 

Achieved:  -400  single  cell  epigenome  profiles  (40%) 

2c:  Determine  cell  type  epigenome  signatures  in  ND  and  T2D  samples 

Milestone  (anticipated,  15  months):  Comprehensive  analysis  of  islet  epigenomes  and  identification  of  cell  type- 
specific  regulatory  element  use/epigenome  signatures 
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Achieved  to  date:  Established  and  evaluated  scATAC-seq  analysis  pipelines;  QC  and  unsupervised  clustering 
analyses  of  1  ND  islet 


2d:  Identify  cell  type-specific  epigenomic  differences  between  ND  and  T2D  samples 

Milestone  (anticipated.  15  months):  Identification  of  cell  type-specific  differences  in  regulatory  element 
use/epigenome  signatures  in  T2D  and  ND  states 

Achieved  to  date:  Identification  of  aggregate  single  cell  ATAC-seq  regulatory  element  signatures  from  1  ND 
islet 


What  was  accomplished  under  these  goals? 

1)  Major  activities'.  We  have  completed 
single  cell  transcriptome  and  epigenome 
profiling  of  1  T2D  and  1  matched  ND 
islet  to  date.  We  have  effectively 
implemented  analysis  pipelines  for  both 
islet  single  cell  transcriptome  and 
epigenome  analyses. 


2)  Specific  objectives'.  Specific  objectives 
in  this  reporting  period  were  (1)  to 
establish  and  test  analytical  pipelines  for 
single  cell  transcriptome  and  epigenome 
analysis  and  (2)  complete  single  cell 
profiling  of  pancreatic  islets  from  5  non¬ 
diabetic  and  5  type  2  diabetic  (T2D) 
donors. 
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3)  Significant  results: 


We  have 
transcriptome 


Figure  1.  Single  cell  transcriptomes  of  ND  islet  cluster  into  major  cell 
types  with  little  evidence  for  subpopulations.  (Left):  t-SNE  dimensional 
reduction  analysis  of  single  cell  transcriptome  similarity.  Cells  clustered 
into  major  endocrine  and  exocrine  cell  types  as  labeled.  (Right):  Violin 
plots  of  marker  gene  expression  for  delta  ( SST ),  gamma  ( PPY ),  beta  (INS), 
and  alpha  ( GCG )  cells.  Note:  color  codes  in  right  panel  do  not  correlate  to 
those  for  cell  clusters  on  the  left. 


completed  single  cell 
profiling  of  4806  cells  from  one  ND  islet 
and  4192  cells  from  one  T2D  islet  using 
the  10X  Genomics  platform.  As  shown 

in  Figures  1  and  2,  we  have  identified  all  major  endocrine  cell  populations  in  both  ND  and  T2D  samples. 
Importantly,  the  ability  to  profile  thousands  of  single  cell  transcriptomes  per  donor  for  the  same  cost  as  the 
hundreds  of  cells  per  donor  originally  proposed  is  allowing  us  to  determine  more  rigorously  (1)  if  there  are 
subpopulations  of  endocrine  cells  in  ND  And  T2D  islets,  (2)  if  new,  rarer  populations  of  cells  such  as  “de¬ 
differentiated  beta  cells”  are  appearing  in  T2D  samples,  and  (3)  if  the  relative  known  and  novel  cell 
(sub)populations  are  changing  in  proportion  between  ND  and  T2D  islets.  Analyses  of  our  single  cell 
transcriptome  data  are  limited  to  n=l  each  for  ND  and  T2D  states,  so  we  refrain  thus  far  from  making  broad 
conclusions.  However,  our  results  so  far  suggest  the  following: 

•  Endocrine  cells  in  ND  islets  cluster  by  cell  type  (alpha,  beta,  delta,  gamma/PP).  We  neither  find 
evidence  of  endocrine  cell  type  subpopulations  nor  of  novel  populations  in  this  ND  islet  (Fig.  1). 

•  T2D  islets  contain  discernible  alpha  and  beta  cell  subpopulations  (Fig.  2A,  Beta  1,2  and  Alphas  1-3). 
We  are  still  in  the  process  of  analyzing  these  data,  but  our  preliminary  differential  analyses  of  Beta  1 
and  Beta  2  subpopulations  (Fig.  2A-D)  indicate  that  Beta  2  expresses  similar  levels  of  INS  as  Beta  1 
(Fig.  2B).  However,  Beta  2  exhibit  significantly  lower  MAFA  (Fig.  2C),  NKX6-1  (Fig.  2D),  and  PDX1, 
MAFB,  and  NEUROD1  (not  shown)  expression,  consistent  with  previous  targeted  observations  by 
Roland  Stein’s  group1.  Moreover,  we  detect  significant  induction  of  genes  modulating  stress  responses 
such  as  amyloid  responses,  processing,  and  degradation  (. APLP2 ,  APP,  ITMB2)  inflammation  (CTSA, 
RBP4),  autophagy  (TMEM59,  LAMP1,  LAMP2,  ATP6AP2),  and  oxidative  and  endoplasmic  reticulum 
stress  responses  ( PRDX4 ,  ERP29 )  in  this  same  population.  Finally,  we  identified  differential  regulation 
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of  T2D  GW  AS  effector 
genes  ZFAND6, 

CDKN2A,  and  KCNK17 
between  these  two  beta 
cell  populations.  We  are 
continuing  these 

analyses  through  the 
next  reporting  period 
and  will  extend  them  to 
identify  the  genes  and 
pathways  distinguishing 
the  three  alpha  cell 
subpopulations.  Most 
importantly,  we  will 
complete  these  analyses 
in  additional  islet 
samples  to  identify 
robust  and  reproducible 
T2D  cell  type-specific 
signatures  among 
multiple  individuals. 

Epigenome  profiling  of  single 
cells  using  ATAC-seq  to 
identify  open  chromatin  sites 
has  been  completed  for  one 
T2D  and  one  ND  individual. 
As  shown  in  Fig.  3,  aggregate 
scATAC-seq  profiles  from  165 
single  cells  identify  many  of 
the  consensus  sites  identified  in 
bulk  islet  samples  (“Islet 
ATAC-seq”) 
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Figure  2.  Single  cell  transcriptomes  reveal  putative  alpha  and  beta  cell 
subpopulations  in  T2D  islets.  (A)  t-SNE  analysis  of  4192  single  cell  transcriptomes 
from  one  T2D  islet  donor.  (B-D)  Beta  2  subpopulation  exhibits  similar  high  levels  of 
INS  expression  (B,  pink/red  dots)  but  significantly  lower  MAFA  (C)  and  NKX6-1  (D) 
expression  than  Beta  1 . 
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(gray  boxes). 

Importantly, 
these  include 
islet- specific 
sites  (compare 
GM12878 
ATAC-seq  to 
“Islet  ATAC- 
seq”  and 

“Aggregate 
scATAC-seq” 
plots)  at  both 
transcription 

start  sites  and  distal  regulatory  elements,  suggesting  our  profiling  approach  is  capturing  the  full  range  of 
regulatory  element  classes  (e.g.,  promoter,  enhancer,  insulator)  in  islets.  The  sparse  nature  of  the  scATAC-seq 
profiles  in  each  specific  single  cell  has  made  assigning  each  single  cell  epigenome  profile  to  a  specific  cell  type 
(e.g.,  alpha,  beta,  delta,  PP/gamma)  challenging.  We  believe  part  of  this  difficulty  is  due  to  technical  issues  of 
over-  or  under-transposition  of  samples.  To  address  this  technical  challenge,  we  are  completing  serial  dilution 
experiments  to  determine  optimal  transposase  concentrations  that  will  yield  robust  scATAC-seq  profiles  for 
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Figure  3.  Aggregation  of  single  cell  ATAC-seq  profiles  largely  recapitulates  the  epigenomic  landscape 
identified  in  bulk  islet  samples.  UCSC  Genome  Browser  view  of  the  KCNJ11/ABCC8  genes  encoding  the 
sulfonylurea  receptor  subunits.  Gray  boxes  highlight  open  chromatin  sites  consistently  identified  by  bulk  islet 
and  aggregated  single  cell  (n=165  cells)  ATAC-seq  profiles.  Several  of  these  sites  are  islet-specific  and  not 
observed  in  a  lymphoblastoid  cell  line  (GM12878).  Gene  models  (RefSeq  Genes)  are  shown  at  the  bottom. 


each  single  cell.  We  do  not  anticipate  that  this  is  an  insurmountable  technical  hurdle.  Pending  the  results  of 
these  experiments,  however,  we  may  explore  an  alternative  approach  to  transpose  multiple  pools  of  tens  to 
hundreds  of  FACS-enriched  alpha,  beta,  delta,  and  gamma  cells  each  to  generate  more  robust  cell  type-specific 
epigenome  profiles  from  ND  and  T2D  islet  donors.  This  alternative  approach  would  only  be  implemented  after 
consultation  with  and  approval  by  Dr.  Thakar,  Program  Officer  on  this  Discovery  Award,  as  it  would  represent 
a  change  in  the  scope  of  the  project,  namely  changing  the  epigenome  profiling  assay  resolution  of  from  single 
cell  to  pools  of  a  specific  enriched  cell  type. 

Stated  goals  not  met:  In  this  reporting  period,  we  have  experienced  challenges  in  obtaining  the  number  of  islets 
originally  proposed.  This  was  due  to  less  frequent  availability  of  T2D  islets  (5  offers  total  between  August  2016 
and  May  2017)  than  anticipated  based  on  frequency  in  the  previous  two  years  of  tracking  (~1  offer  per  month). 
Moreover,  for  3  T2D  offers,  the  islet  preparation  at  ProdoLabs  or  IIDP  failed  to  yield  high  quality  islets.  To 
address  this  challenge  in  obtaining  T2D  islets,  we  discussed  and  implemented  an  alternative  islet  subscription 
strategy  with  IIDP  to  maximize  our  chances  to  obtain  T2D  islets.  Since  its  implementation  in  May,  we  have 
received  3  targeted  offers,  for  which  one  donor  met  our  inclusion  criteria.  We  have  identified  and  profiled  an 
appropriately  matched  non-diabetic  donor.  As  described  in  Major  activities  and  Significant  results  above,  our 
experimental  and  analytical  pipelines  are  robust  and  generating  high  quality  profiles,  so  we  are  equipped  and 
prepared  to  complete  expedited  analyses  of  the  islet  datasets  in  months  13-18  .  Given  the  current  rate  of 
obtaining  islets,  we  anticipate  we  can  obtain  islets  from  3-4  T2D  donors  from  the  same  number  of  matched  ND 
donors  and  analyze  them  within  the  upcoming  5-6  months.  If  we  experience  another  lag  in  T2D  islet  availability 
in  the  upcoming  1-2  months,  we  will  seek  alternative  sources  of  islets,  including  the  University  of  Alberta 
center  directed  by  Dr.  Patrick  MacDonald. 

What  opportunities  for  training  and  professional  development  has  the  project  provided? 


Nothing  to  Report 


How  were  the  results  disseminated  to  communities  of  interest? 


Nothing  to  Report 


What  do  you  plan  to  do  during  the  next  reporting  period  to  accomplish  the  goals? 

To  deal  with  the  lag  in  obtaining  T2D  islets  due  to  lower  than  anticipated  availability  of  T2D  islets  and  isolation 
failures  at  the  IIDP/ProdoLab  centers,  (noted  in  3.3  above),  we  have  specifically  discussed  and  implemented  a 
strategy  with  the  IIDP  coordinating  team  to  maximize  targeted  offers  for  T2D  islets  and  match  non-diabetic 
donor  islets  using  the  Open  offers.  These  cells  are  still  acquired  under  the  same  protocols  used  by  the 
IIDP/ProdoLabs  and  approved  by  OHRP.  After  implementing  this  change,  we  have  obtained  a  well-matched 
T2D-ND  set  of  donors  in  the  span  of  1  month.  Moreover,  our  adaptation  of  the  10X  Genomics  platform  for 
single  cell  RNA-sequencing  allows  us  to  sample  -4000-6000  single  cell  transcriptomes  per  sample,  which  will 
maximize  our  ability  to  detect  cellular  subpopulations  and  determine  quantitative  differences  between  these  cell 
types/subpopulations  within  a  given  sample  (as  shown  in  Fig.  2)  and  between  T2D  and  ND  samples. 

lc:  Determine  cell  type  transcriptome  signatures  in  ND  and  T2D  samples 

Milestone  (anticipated,  month  15):  Comprehensive  analysis  of  islet  transcriptomes  and  identification  of  cell 
type-specific  transcriptomes  /  “signature”  genes 

Plans  moving  forward:  We  believe  that  cell  type-specific  transcriptomes  will  not  differ  substantially  between 
ND  donors,  and  therefore  anticipate  that  we  have  already  identified  the  cell  type-specific  transcriptomes  of 
healthy  islets.  However,  based  on  our  initial  T2D  islet  results,  we  could  conceivably  identify  different 
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subpopulations  among  different  T2D  donors.  With  the  updated  islet  procurement  strategy,  we  anticipate  to  be 
able  to  obtain  and  sequence  the  remaining  four  T2D  donor  islets  by  month  15-16.  Analysis  pipelines  are  in 
place  and  robust,  so  we  expect  that  analyzing  the  additional  data  should  take  no  longer  than  2-3  weeks,  which 
should  keep  us  within  our  original  plans  to  summarize  and  write  up  study  results  by  month  18. 

Id:  Identify  cell  type-specific  expression  differences  between  ND  and  T2D  samples 

Milestone  (anticipated,  month  15):  Identification  of  cell  type-specific  differential  expression  in  T2D  vs.  ND 
samples 

Plans  moving  forward:  As  noted  above,  we  anticipate  obtaining  and  profiling  all  needed  T2D  and  ND  samples 
by  months  16  and  17,  respectively.  Differential  analyses  between  specific  cell  types  (e.g.  T2D  delta  vs.  ND 
delta)  should  take  approximately  1-2  weeks  additional  time. 

2c:  Determine  cell  type  epigenome  signatures  in  ND  and  T2D  samples 

Milestone  (anticipated,  15  months):  Comprehensive  analysis  of  islet  epigenomes  and  identification  of  cell  type- 
specific  regulatory  element  use/epigenome  signatures 

Plans  moving  forward:  As  noted  in  Significant  Results,  aggregate  scATAC-seq  profiles  reflect  those  of  bulk 
islets,  but  the  sparse  nature  of  these  epigenomic  datasets  have  made  assigning  each  single  cell  ATAC-seq 
profile  to  a  specific  cell  type  challenging.  We  have  transposed  two  additional  samples  and  will  determine  if 
changing  transposase  concentration  results  in  more  robust  scATAC-seq  profiles  in  the  next  month.  If  it  does 
not,  we  will  contact  Dr.  Thakar  and  discuss  an  alternative  approach,  namely  ATAC-seq  of  each  endocrine  cell 
type  enriched  by  fluorescence-activated  cell  sorting  (FACS)  to  define  alpha,  beta,  delta,  and  PP/gamma  cell 
type  epigenomes  in  ND  and  T2D  islets.  As  these  samples  will  be  processed  in  parallel  with  the  transcriptome 
profiling  {Aim  Id),  we  anticipate  the  same  timeline  to  reach  completion. 

2d:  Identify  cell  type-specific  epigenomic  differences  between  ND  and  T2D  samples 

Milestone  (anticipated.  15  months):  Identification  of  cell  type-specific  differences  in  regulatory  element 
use/epigenome  signatures  in  T2D  and  ND  states 

Plans  moving  forward:  The  specific  strategy  to  meet  a  timeline  of  completed  data  collection  and  analyses  of 
-16-17  months  is  detailed  for  Aim  2c,  above.  We  have  completed  differential  analyses  of  whole  islet  ATAC-seq 
profiles  from  ND  and  T2D  donors,  thus  the  pipeline  is  in  place  and  analysis  should  <  2  weeks  longer  than  the 
time  needed  to  profile  and  sequence  the  samples. 


4.  IMPACT:  Describe  distinctive  contributions,  major  accomplishments,  innovations,  successes,  or  any 
change  in  practice  or  behavior  that  has  come  about  as  a  result  of  the  project  relative  to: 

What  was  the  impact  on  the  development  of  the  principal  discipline(s)  of  the  project? 

Nothing  to  Report 


What  was  the  impact  on  other  disciplines? 

Our  single  cell  studies  and  datasets  led  to  some  collaborative  discussions  with  Dr.  Zhijin  Wu  at  Brown 
University.  She  is  developing  novel  approaches  to  analyze  single  cell  data  to  reveal  new  and  unique  insights 
into  the  dynamics  and  mechanisms  of  gene  expression  at  the  single  cell  level.  The  results  of  this  collaboration 
have  yielded  co-authorship  on  a  manuscript  describing  their  methodology,  which  is  in  preparation  for 
submission  to  Bioinformatics. 
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What  was  the  impact  on  technology  transfer? 


Nothing  to  Report 


What  was  the  impact  on  society  beyond  science  and  technology? 

Data  from  this  study  were  shared  in  the  Community  Health  Discussion  series,  a  community  outreach  initiative 
involving  The  Jackson  Laboratory  and  The  Children’s  Museum  in  West  Hartford.  In  this  educational 
presentation  for  the  general  public,  I  presented  initial  results  from  our  single  cell  islet  transcriptome  analyses 
and  explained  how  they  can  reshape  our  understanding  of  pathogenic  events/processes  contributing  to  diabetes 
and  how  they  may  shift  our  approach  to  preventing  and  treating  diabetes. 


5.  CHANGES/PROBLEMS: 

Changes  in  approach  and  reasons  for  change 


Nothing  to  Report 


Actual  or  anticipated  problems  or  delays  and  actions  or  plans  to  resolve  them 

As  noted  above,  we  encountered  unforeseen  delays  obtaining  T2D  islets  due  to  lower  than  usual  availability  and 
unproductive  islet  yields  in  ProdoLabs/IIDP  islet  distribution  centers.  In  consultation  with  the  IIDP 
coordinating  center,  we  have  implemented  a  new  strategy  to  prioritize  T2D  islets  in  targeted  offers  and  to  obtain 
matched  non-diabetic  islet  offers  via  the  open  offers.  In  the  span  of  one  month,  this  strategy  yielded  islets  from 
a  well-matched  pair  of  T2D  and  ND  donors.  Thus,  we  are  confident  this  plan  will  result  in  higher  rates  of 
procurement  over  the  next  3-5  months  that  should  allow  us  to  meet  originally  proposed  milestones  and  ultimate 
deliverables  of  cell  type-specific  differential  expression  (Aim  Id)  and  cell  type-specific  differences  in 
regulatory  element  use  (Aim  2d)  in  T2D  vs,  ND  samples  central  to  this  proposal. 

Changes  that  had  a  significant  impact  on  expenditures 

We  have  incurred  less  material  and  experimental  costs  than  budgeted  in  this  current  reporting  period  due  to  the 
delays  in  obtaining  suitable  islets  matching  our  inclusion  criteria.  As  we  increase  our  T2D  islet  procurement 
rate  (and  the  corresponding  matched  ND  samples),  we  anticipate  that  these  decreased  costs  will  normalize  to 
our  projected  budget  in  the  upcoming  months. 


Significant  changes  in  use  or  care  of  human  subjects,  vertebrate  animals,  biohazards,  and/or  select 
agents: 

Significant  changes  in  use  or  care  of  human  subjects 


Nothing  to  Report 
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Significant  changes  in  use  or  care  of  vertebrate  animals. 


Not  applicable;  project  does  not  involve  the  use  of  vertebrate  animals. 


Significant  changes  in  use  of  biohazards  and/or  select  agents 


Nothing  to  Report 


6.  PRODUCTS:  List  any  products  resulting  from  the  project  during  the  reporting  period.  If  there  is  nothing 
to  report  under  a  particular  item,  state  “Nothing  to  Report.” 

•  Publications,  conference  papers,  and  presentations 

Report  only  the  major  publication(s)  resulting  from  the  work  under  this  award. 

Journal  publications.  List  peer-reviewed  articles  or  papers  appearing  in  scientific,  technical,  or 
professional  journals.  Identify  for  each  publication:  Author(s);  title;  journal;  volume:  year;  page 
numbers;  status  of  publication  ( published ;  accepted,  awaiting  publication;  submitted,  under  review; 
other);  acknowledgement  of  federal  support  (yes/no). 

1.  Lawlor  N,  Klietan  S,  Ucar  D,  and  Stitzel  ML.  Genomics  of  Islet  (Dys)function  and  Type  2  Diabetes.  Trends 
Genet.  2017  Apr;33(4):244-255.  PMID28245910; 

Acknowledgement  of  federal  support:  Yes 
Please  see  Appendix 

2.  Lawlor  N,  George  J,  Bolisetty  M,  Kursawe  R,  Sun  L,  V  S,  Kycia  I,  Robson  P,  Stitzel  ML.  Single  cell 
transcriptomes  identify  human  islet  cell  signatures  and  reveal  cell-type-specific  expression  changes  in  type  2 
diabetes.  2016.  Genome  Res.  Nov  18.  [Epub  ahead  of  print]  PMID:  27864352 

Acknowledgment  of  federal  support:  Yes 
Please  see  Appendix 

3.  Wu  Z,  Zhang  Y,  Stitzel  ML,  and  Wu  H.  Two-phase  differential  expression  analysis  for  single  cell  RNA-seq. 
Under  review,  Bioinformatics. 

4.  Bolisetty  M,  Stitzel  ML,  and  Robson,  P.  CellView:  Interactive  Exploration  of  High  Dimensional  Single 
Cell  RNA-seq  Data.  Under  review,  Bioinformatics. 

BioRxiv  link:  (http  ://biorxiv  .org/content/earl y/20 1 7/04/04/ 123810) 

Please  see  Appendix 

Books  or  other  non-periodical,  one-time  publications. 

Nothing  to  Report 


Other  publications,  conference  papers,  and  presentations. 


Nothing  to  Report 
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Website(s)  or  other  Internet  site(s) 


Nothing  to  Report 

Technologies  or  techniques 

Nothing  to  Report 

Nothing  to  Report 


Inventions,  patent  applications,  and/or  licenses 


Nothing  to  Report 


Other  Products 


7.  PARTICIPANTS  &  OTHER  COLLABORATING  ORGANIZATIONS 


What  individuals  have  worked  on  the  project? 


Name: 

Project  Role: 

Research  Identifier: 

Nearest  person  month  worked: 
Contribution  to  Project: 


Michael  Stitzel 
PD/PI 

http.V/orcid.  org/0000-0001  -5630-559X 

1 

Dr.  Stitzel  has  managed  the  project,  directing  both  the  experiments 
and  analyses  completed  by  Drs.  Kursawe  and  Mr.  Lawlor 


Name: 

Project  Role: 

Researcher  Identifier: 

Nearest  person  month  worked: 
Contribution  to  Project: 


Joshy  George 

Co-Investigator/Computational  Scientist 
http://orcid.org/0000-0001-8510-8229 
1 

Dr.  George  has  set  up  pipelines  and  trained  Mr.  Lawlor  to 
complete  the  computational  analyses. 
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Name: 

Project  Role: 

Research  Identifier: 

Nearest  person  month  worked: 
Contribution  to  Project: 


Name: 

Project  Role: 

Research  Identifier: 

Nearest  person  month  worked: 
Contribution  to  Project: 


Romy  Kursawe 
Research  Assistant  IV 

4 

Dr.  Kursawe  has  completed  all  of  the  experiments  for  the  project, 
including  processing  islets,  preparing  single  cell  suspensions, 
preparing  RNA,  transposing  nuclei,  and  preparing  ATAC-seq 
libraries 

Nathan  Lawlor 
Data  Analyst 

http://orcid.or8/0000-0003-3263-6057 

1 

Mr.  Lawlor  has  implemented  and  runs  the  published  analysis 
pipelines  for  single  cell  ATAC-seq  and  single  cell  RNA-seq  and 
internal  pipelines  established  by  Dr.  George.  In  addition,  he  has 
completed  differential  analyses  between  cell  (sub)populations  and 
has  participated  in  analyses  and  content  for  peer-reviewed 
publications  supported  by  this  funding 


Has  there  been  a  change  in  the  active  other  support  of  the  PD/PI(s)  or  senior/key  personnel  since  the  last 
reporting  period? 

Yes.  Changes  in  Other  Support  for  Key  Personnel  are  italicized  below. 

Stitzel,  Michael  L 
Active 


Supporting  Agency: 

NIH/NIDDK  5  R00  DK092251-05 

PI:  Stitzel 

Project  Title: 

Investigation  of  noncoding  variation  in  human  pancreatic  islets  and  their 
developmental  precursors 

Role: 

Principal  Investigator 

Effort:  6.00  CM 

Entire  Project: 

08/20/2014-07/31/2017 

$794,250 

Current  Year: 

08/01/2016-07/31/2017 

$249,000 

Project  Goals: 

The  goal  of  this  research  project  is  to  understand  the  role  that  genetic  variation 
in  non-protein  coding  regulatory  regions  of  the  genome  play  in  human 
pancreatic  islet  function  and  dysfunction. 

Specific  Aims: 

1 .  Determine  which  of  approximately  40  candidate  regulatory  regions  in  the 
human  genome  function  as  enhancers,  silencers,  or  insulators  in  human  islets;  2. 
Determine  which  elements  contain  variants  that  alter  gene  expression  in  adult 
human  islets;  3.  Determine  which  elements  contain  variants  that  alter  gene 
expression  in  pancreatic  precursor  cells. 

Overlap: 

None 

Contracting/ 

Grants  Officer: 

Sheryl  Sato  -  satos@extra.niddk.nih.gov 
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Supporting  Agency: 

Department  of  Defense 
W81XWH-16-1-0130 

PI:  Stitzel 

Project  Title: 

Single-Cell  Dissection  of  Human  Pancreatic  Islet  Dysfunction  in  Diabetes 

Role: 

Principal  Investigator 

Effort: 

0.50  CM 

Entire  Project: 

06/01/2016  - 11/30/2017 

$350,000 

Current  Year: 

06/01/2016  - 11/30/2017 

$350,000 

Project  Goals: 

The  goal  of  this  project  is  to  test  the  hypothesis  that  non-diabetic  (ND)  and  T2D 
islets  exhibit  distinct  cell  type-specific  transcript omic  and/or  epigenomic 
signatures  that  are  masked  by  the  cellular  heterogeneity  in  whole  islet  studies. 

Specific  Aims: 

Aim  1:  Identify  cell  type-specific  transcriptome  signature  differences  between 
non-diabetic  (ND)  and  T2D 

islets  using  single  cell  RNA-sequencing  (scRNA-seq);  Aim  2:  Identify  cell  type- 
specific  epigenome  (open  chromatin)  signature  differences  between  ND  and 

T2D  islets  using  single  cell  assay  for  transposase-accessible  chromatin  using 
sequencing  (scATAC-seq). 

Overlap: 

None 

Contracting/ 

Grants  Officer: 

Lisa  Sawyer  -  Iisa.m.sawyer22.civ@mail.mil 

Completed 

The  Jackson  Laboratory  Director's  Innovation  Fund  JAX-DIF-FY15-DUJB 

“Advancing  ATAC-seq  Data  Generation  and  Analysis  Pipeline  for  Epigenetic  Biomarker  Discovery” 
PI:  Banchereau/Ucar 
Role:  Co-Investigator 

The  Jackson  Laboratory  Director's  Innovation  Fund  T JL-DIF -FY 14-GRHGWC 
“Maximizing  Human  and  Mouse  Resources  to  Identify  Novel  Variants  for  Alzheimer’s  Disease” 

PL  Carter  /  Howell 
Role:  Co-Investigator 

The  Jackson  Laboratory  Director's  Innovation  Fund  TJL-DIF-FY14-GWC 
“Genetics  of  Molecular  Epigenetics  ” 

PL  Carter 

Role:  Co-Investigator 

George,  Joshy 
Active 


Supporting  Agency: 

NIH/NCI  5  R01  CA 1957 12-03 

PI:  Flavell  /  Palucka 

Project  Title: 

Humanized  mouse  models  to  dissect  in  vivo  the  interplay  between  melanoma 
and  the  immune  system 

Role: 

Co-Investigator 

Effort:  0.60  CM 

Entire  Project: 

05/14/2015  -  04/30/2018 

$1,304,550 

Current  Year: 

05/01/2017  -  04/30/2018 

$395,352 

Project  Goals: 

The  goal  of  this  project  is  to  credential  the  humanized  MISTRG  mouse  model  as 
a  platform  for  investigating  the  immune-mediated  mechanisms  of  tumorigenesis 
by  establishing  transcriptional  signatures  linked  with  melanoma  progression  and 
confirming  these  signatures  in  tumors  from  patients. 

Specific  Aims: 

1.  Determine  the  architecture  of  human  melanoma  tumors  and  their  impact  on 
human  tumor-infiltrating  immune  cells  in  vivo  in  MISTRG  mice  reconstituted 
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with  donor  CD34+  HPCs  and  melanoma  cell  lines;  2.  Define  how  human 
melanoma  alters  the  human  systemic  immunity  in  MISTRG  mice;  3.  Validate 
the  MISTRG  model  in  an  autologous  system  where  MISTRG  mice  are 
reconstituted  with  patient  CD34+  HPCs  and  autologous  tumors. 

Overlap: 

None 

Contracting/ 

Grants  Officer: 

Debra  Sowell  -  debra.sowell@nih.gov 

Supporting  Agency: 

NIH/NIA  5  R01  AG052608-02 

PI:  Banchereau 

Project  Title: 

Genomics  and  Epigenomics  of  the  Elderly  Response  to  Pneumococcal  Vaccines 

Role: 

Co-Investigator 

Effort:  0.96  CM 

Entire  Project: 

09/01/2016  -  04/30/2020 

$2,348,313 

Current  Year: 

05/01/2017-  04/30/2018 

$546,881 

Project  Goals: 

The  goal  of  this  project  is  to  dissect  the  age-related  changes  in  immune  cells  that 

affect  responses  to 

microbial  vaccination  in  the  elderly. 

Specific  Aims: 

1.  To  vaccinate  healthy  elderly  with  two  distinct  pneumococcal  vaccines,  collect 
longitudinal  blood  samples  and  assess  pneumococcal-specific  antibody 
responses;  2.  To  establish  the  transcriptional  and  epigenetic  profiles  of  elderly 
blood  immune  cells  linked  with  antibody  responses  to  pneumococcal 
vaccination;  3.  To  examine  the  functional  status  of  immune  cells  in  the  elderly 
stratified  according  to  their  pneumococcal  vaccine  responder  status. 

Overlap: 

None 

Contracting/ 

Grants  Officer: 

Mitchell  Whitfield  -  whitfieldm@od.nih.gov 

Supporting  Agency: 

N1H/NIAID  5  R01  A1121920-02 

PI:  Unutmaz 

Project  Title: 

Decoding  Immunological  perturbations  during  Chronic  Fatigue  Syndrome 

Role: 

Co-Investigator 

Effort:  1.80  CM 

Entire  Project: 

06/01/2016  -  05/31/2021 

$3,281,517 

Current  Year: 

06/01/2017  -  05/31/2018 

$642,090 

Project  Goals: 

The  goal  of  this  project  is  to  develop  a  detailed  functioned  and  genetic 
immunological  framework  that  can  be  used  to  decode  the  mechanisms  of 

Mvcdgic  Encephalomyelitis  and  Chronic  Fatigue  Syndrome  (ME/CFS)  and  to 
develop  robust,  cjuantitative  immune-biomarker  sets  for  predicting  disease 
susceptibility,  stratifying  patients  and  guiding  treatment  strategies. 

Specific  Aims: 

l)To  determine  the  frequencies  of  immune  cell  subsets  in  the  blood  of  a 
clinically  defined  ME/CFS  patient  cohort;  2)  To  assess  functional  capacity  of 
memory  T  cells,  innate  cells  and  the  differentiation  potential  of  naive  T  cells 
during  ME/CFS;  and  3)  To  determine  the  T  cell  and  innate  cell  subset? specific 
gene  and  IncRNA  transcripts  in  ME/CFS  patient  blood  samples. 

Overlap: 

None 

Contracting/ 

Grants  Officer: 

LeBrit Nickerson  -  lebrit.nickerson@nih.gov 

Supporting  Agency: 

NIH/NIAID  5  U01  AI124297-02 

PI: 

Banchereau 

Project  Title: 

Combination  Adjuvants  to  Activate  Human  Dendritic  Cell  Subsets  and  B  Cells 

Role: 

Co-Investigator 

Effort: 

0.96  CM 
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Entire  Project: 

04/15/2016  -  03/31/2021 

$3,177,983 

Current  Year: 

04/01/2017-03/31/2018 

$640,154 

Project  Goals: 

Our  goal  is  to  select  a  combination  adjuvant  using  functional  assays,  followed 
by  in-depth  investigation  of  molecular 

pathways  accounting  for  enhanced  immunogenicity.  Our  collaboration  with 
industry  will  enable  the  transition  of  the  selected  combination  adjuvant  to 
further  studies  of  human  vaccination. 

Specific  Aims: 

1.  Select  a  combination  adjuvant  that  induces  maximal  humoral  immunity  in 
vitro;  2.  Identify  the  molecular  mechanisms  through  which  the  selected 
combination  adjuvant  activates  DCs;  3.  Examine  the  adjuvant  effect  in  vivo  and 
validate  the  selected  pathway  using  CRISPR-CAS9  engineering  of  human 
CD34+HPCs. 

Overlap: 

None 

Contracting/ 

Grants  Officer: 

LeBrit Nickerson  -  lebrit.nickerson@nih.gov 

Supporting  Agency: 

Department  of  Defense 
W81XWH-16-1-0130 

PI:  Stitzel 

Project  Title: 

Single-Cell  Dissection  of  Human  Pancreatic  Islet  Dysfunction  in  Diabetes 

Role: 

Co-Investigator 

Effort:  0.40  CM 

Entire  Project: 

06/01/2016  - 11/30/2017 

$350,000 

Current  Year: 

06/01/2016  - 11/30/2017 

$350,000 

Project  Goals: 

The  goal  of  this  project  is  to  test  the  hypothesis  that  non-diabetic  (ND)  and  T2D 
islets  exhibit  distinct  cell  type-specific  transcriptomic  and/or  epigenomic 
signatures  that  are  masked  by  the  cellular  heterogeneity  in  whole  islet  studies. 

Specific  Aims: 

Aim  1:  Identify  cell  type- specific  transcriptome  signature  differences  between 
non-cliabetic  (ND)  and  T2D 

islets  using  single  cell  RNA-sequencing  (scRNA-seq);  Aim  2:  Identify  cell  type- 
specific  epigenome  (open  chromatin )  signature  differences  between  ND  and 

T2D  islets  using  single  cell  assay  for  transposase-accessible  chromatin  using 
sequencing  (scATAC-seq). 

Overlap: 

None 

Contracting/ 

Grants  Officer: 

Lisa  Sawyer  -  lisa.ni.sawyer22.civ@mail.mil 

Supporting  Agency: 

Department  of  Defense 

W81XWH- 17- 1-0010 

PI:  Palucka 

Project  Title: 

Molecular  Mechanisms  of  Human  TNBC  Metastasis  In  Vivo 

Role: 

Computational  Scientist 

Effort:  0.60  CM 

Entire  Project: 

02/01/2017  -  01/31/2020 

$1,386,446 

Current  Year: 

02/01/2017  -  01/31/2018 

$460,079 

Project  Goals: 

Our  overall  goal  is  to  test  the  hypothesis  that  TNBC  metastatic  dissemination  is 
driven  by  a  specific  cancer-immune  cell  interaction  that  can  be  identified  and 
targeted  for  therapy. 

Specific  Aims: 

1.  Establish  the  capacity  of  TNBC  PDXs  to  disseminate  and  form  metastasis  in 
hNSG-SGM3  mice;  2.  Identify  immune  cells  and  their  molecular  signatures 
linked  with  TNBC  PDX  metastasis. 

Overlap: 

None 

Contracting/ 

Grants  Officer: 

Susan  Dodd  -  susan.l.dodd4.civ@mail.mil 
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Completed 


None 


What  other  organizations  were  involved  as  partners? 


Nothing  to  Report 


8.  SPECIAL  REPORTING  REQUIREMENTS 

COLLABORATIVE  AWARDS:  For  collaborative  awards,  independent  reports  are  required  from  BOTH  the 
Initiating  PI  and  the  Collaborating/Partnering  PI.  A  duplicative  report  is  acceptable;  however,  tasks  shall  be 
clearly  marked  with  the  responsible  PI  and  research  site.  A  report  shall  be  submitted 
to  https  ://ers.amedd.army.mil  for  each  unique  award. 

QUAD  CHARTS:  If  applicable,  the  Quad  Chart  (available  on  https://www.usamraa.army.mil)  should  be 
updated  and  submitted  with  attachments. 


9.  APPENDICES: 

a)  Lawlor  N,  Khetan  S,  Ucar  D,  and  Stitzel  ML.  Genomics  of  Islet  (Dys)function  and  Type  2  Diabetes. 
Trends  Genet.  2017  Apr;33(4):244-255.  PMID28245910; 

b)  Lawlor  N,  George  J,  Bolisetty  M,  Kursawe  R,  Sun  L,  V  S,  Kycia  I,  Robson  P,  Stitzel  ML.  Single  cell 
transcriptomes  identify  human  islet  cell  signatures  and  reveal  cell-type-specific  expression  changes  in 
type  2  diabetes.  2016.  Genome  Res.  Nov  18.  [Epub  ahead  of  print]  PMID:  27864352 

c)  Bolisetty  M,  Stitzel  ML,  and  Robson,  P.  CellView:  Interactive  Exploration  of  High  Dimensional  Single 
Cell  RNA-seq  Data.  Under  review,  Bioinformatics. 
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Trends  in  Genetics 


Cell 


Review 

Genomics  of  Islet  (Dys) 
function  and  Type  2  Diabetes 

Nathan  Lawlor,1  Shubham  Khetan,1,2  Duygu  Ucar,1,3  and 
Michael  L.  Stitzel1,2,3’* 

Pancreatic  islet  dysfunction  and  beta  cell  failure  are  hallmarks  of  type  2  diabe¬ 
tes  mellitus  (T2DM)  pathogenesis.  In  this  review,  we  discuss  how  genome-wide 
association  studies  (GWASs)  and  recent  developments  in  islet  (epi)genome  and 
transcriptome  profiling  (particularly  single  cell  analyses)  are  providing  novel 
insights  into  the  genetic,  environmental,  and  cellular  contributions  to  islet  (dys) 
function  and  T2DM  pathogenesis.  Moving  forward,  study  designs  that  interro¬ 
gate  and  model  genetic  variation  [e.g.,  allelic  profiling  and  (epi)genome  editing] 
will  be  critical  to  dissect  the  molecular  genetics  of  T2DM  pathogenesis,  to  build 
next-generation  cellular  and  animal  models,  and  to  develop  precision  medicine 
approaches  to  detect,  treat,  and  prevent  islet  (dys)function  and  T2DM. 

Lay  of  the  Land:  (Functional)  Genomic  Landscape  of  Islets  and  T2DM 

T2DM  is  a  complex  metabolic  disorder  with  both  genetic  and  environmental  components.  It 
results  from  the  dysfunction  and  loss  of  insulin-secreting  beta  cells  in  the  endocrine  pancreas 
(Islets  of  Langerhans)  as  they  work  to  secrete  more  insulin  to  counteract  insulin  resistance  in 
peripheral  tissues  (adipose,  skeletal  muscle,  and  liver).  Ultimately,  T2DM  manifests  as  uncon¬ 
trolled  elevations  in  blood  glucose  levels.  GWAS  (see  Glossary)  have  systematically  identified 
hundreds  of  single  nucleotide  variants  (SNVs),  representing  >150  regions  of  the  genome 
(loci)  [1],  that  are  associated  with  T2DM  risk  and  differences  in  T2DM-related  quantitative 
metabolic  traits,  such  as  insulin,  proinsulin,  and  glucose  levels.  Most  (>90%)  of  these  SNVs 
reside  in  noncoding  regions  of  the  genome.  In  parallel,  functional  (epi)genomics  approaches  to 
map  open  chromatin  using  DNase  I  hypersensitive  site  sequencing  (DNase-seq),  assay 
fortransposase-accessible  chromatin  sequencing  (ATAC-seq),  and  histone  modification 
and  transcription  factor  (TF)-binding  patterns  using  chromatin  immunoprecipitation 
sequencing  (ChIP-seq)  have  identified  genome-wide  location  of  regulatory  elements  (REs), 
such  as  promoters,  enhancers,  and  insulators,  in  >150  human  cell  types  and  tissues.  T2DM 
SNVs  are  significantly  and  specifically  enriched  in  islet-specific  REs  [2-7],  suggesting  that 
changes  in  islet  RE  activity  and  target  gene  expression  are  a  common  mechanism  underlying 
the  molecular  genetics  of  islet  dysfunction  and  T2DM  [8]  (Figure  1A).  Indeed,  recent  studies 
have  identified  putative  factors  binding  these  REs  and  have  detected  allelic  effects  on  their 
binding  and  target  gene  expression  [9-11]. 

In  this  review,  we  discuss  how  recent  studies  are  improving  our  understanding  of  how  islet  REs 
are  perturbed  by  SNVs  contributing  to  T2DM  risk  [1,12-19]  and  are  elucidating  the  transcrip¬ 
tional  underpinnings  of  islet  responses  to  (patho)physiological  environmental  changes,  such  as 
aging,  circadian  rhythms,  Western  diet  and  lifestyle,  as  well  as  oxidative,  endoplasmic  reticulum 
(ER),  and  inflammatory  stress  responses  [20-25].  We  explore  how  studies  applying  next- 
generation  sequencing  (NGS)  to  profile  individual  cells  are  improving  our  comprehension  of  islet 
biology  and  reshaping  our  view  of  T2DM  pathogenesis.  Finally,  we  examine  similarities  and 
differences  between  mice  and  humans  in  the  ‘omics  of  islet  function  and  T2DM  (summarized  in 
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Recent  studies  reaffirm  the  common 
variant  origins  of  T2DM  genetic  risk. 
Variants  overlap  nonooding  genomic 
regions,  implicating  regulatory  defects 
in  T2DM  etiology. 

Environmental  stressors  are  asso¬ 
ciated  with  changes  in  gene  expres¬ 
sion  programs  leading  to  T2DM 
progression. 

Single  cell  sequencing  technologies 
permit  investigation  of  islet  cell  type 
transoriptomes  and  epigenomes  with 
single  cell  resolution  and/or  precision. 
Such  methods  provide  greater  insight 
into  cell  type-specific  perturbations 
and  their  roles  in  T2DM. 
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Figure  1 .  Genomic  Effects  of  Genetic  and  Environmental  Perturbations  Contributing  to  Pancreatic  Islet 
Dysfunction  and  Type  2  Diabetes  Mellitus  (T2DM).  (A)  DNA  single  nucleotide  variants  (SNVs)  may  enhance  (gain-of- 
function)  or  diminish  (loss-of-function)  transcription  element  (e.g.,  enhancer)  activity  and  islet  gene  expression.  Most 
T2DM-associated  SNVs  reside  in  noncoding  regions  of  the  genome  and  overlap  islet  regulatory  elements  (REs) 
[2,3,12,14,15,32,47],  implicating  disruptions  in  gene  regulatory  network  components  as  a  central  molecular  feature  in 
disease  pathogenesis.  A  subset  of  SNVs  has  been  linked  to  changes  in  basal  islet  gene  expression  [11,31].  (B) 
Environmental  factors,  such  as  inflammation,  diet,  aging,  circadian  rhythms,  and  stress,  may  also  influence  RE  activity, 
resulting  in  altered  and/or  novel  transcription  of  genes  essential  for  islet  function  [20-25,48-50,57,58],  Abbreviations:  TF, 
transcription  factor;  TSS,  transcription  start  site 


Figure  2,  Key  Figure).  Throughout,  we  highlight  future  challenges  and  opportunities  and  offer 
perspectives  on  how  these  recent  developments  set  the  stage  for  precision  medicine 
approaches  to  understand,  treat,  and  prevent  T2DM. 

Homing  in  on  T2DM  Genetic  Risk  and  Architecture 

Since  initial  T2DM  GWAS  reports  in  2007  [26-29],  the  list  of  genomic  loci  in  which  sequence 
variation  contributes  to  T2DM  risk  and  variability  in  quantitative  measures  of  pancreatic  islet 
function  has  grown  to  over  1 50  [1 ,1 4,30].  Associated  SNVs  at  each  locus  contribute  modestly 
to  increased  T2DM  risk  [odds  ratios  (OR)  1 .05-1 .75].  Together,  these  loci  only  explain  a  fraction 
of  T2DM  heritability  [13,14].  Genetic  consortia  continue  to  dissect  the  genetic  architecture  of 
T2DM  using  larger  cohorts  with  increasing  ethnic  diversity  and/or  representation.  Recent  efforts 
have  reported  [12,14,30,99]  fewer  ‘new’  T2DM  loci  (A/=10)  than  previous  studies.  Importantly, 
however,  they  are  refining  the  genetic  signals  at  known  (previously  associated)  T2DM  loci  to 
define  ‘credible  sets’  of  single  nucleotide  polymorphisms  (SNPs)  that  are  the  most 
probable  causal  and/or  functional  SNPs  driving  the  association  and,  consequently,  the  result¬ 
ing  molecular  and/or  phenotypic  consequences. 

The  GOT2D  and  T2D-GENES  consortia  sought  to  identify  less  common  SNVs  (0. 1  %<MAF<5%) 
with  larger  effect  size  that  may  underlie  common  variant  associations  or  may  account 
for  some  of  the  T2DM  ‘missing  heritability’  using  a  combined  whole-genome  sequencing 
(WGS),  exome  sequencing,  and  genotype  imputation  approach  [14],  These  efforts  identified 
protein-coding  variants  and/or  mutations  that  are  the  most  likely  causative  variant  or 
effector  transcripts  for  12  out  of  78  GWAS  loci,  confirming  five  nominated  in  previous 
studies  ( PPARG ,  KCNJ1 1-ABCC8,  SLC30A8,  GCKR,  and  PAM  loci)  and  identifying  seven 


Glossary 

Assay  for  transposase-accessible 
chromatin  sequencing  (ATAC- 
seq):  a  technique  used  to  profile 
regions  of  open  chromatin  from  small 
cell  numbers. 

Chromatin  immunoprecipitation 
sequencing  (ChIP-seq):  a  method 
used  to  study  DNA-protein 
interactions. 

Chromatin  interaction  analysis  by 
paired-end  tag  sequencing  (ChlA- 
PET):  a  method  used  to  study  3D 
chromatin  interactions  genome  wide. 
CpG  sites;  areas  of  DNA  containing 
a  cytosine  nucleotide  directly  linked 
to  a  single  phosphate  group  and 
guanine  nucleotide.  These  sites  are 
often  methylated  and  influence 
transcription. 

Credible  sets  of  SNPs:  lists  of 
sequence  variants  with  95% 
posterior  probability  of  containing  a / 
the  causal  disease-associated  SNP 
(s)  [99], 

Deconvolution:  a  statistical 
framework  to  resolve  a 
heterogeneous  mixture  into  its 
constituent  elements. 
Dedifferentiation:  the  process  in 
which  a  mature  differentiated  cell 
type  reverts  to  an  earlier 
developmental  and/or  precursor 
state. 

DNA  methylation:  molecular 
process  wherein  a  methyl  group  is 
covalently  attached  to  a  DNA  base 
without  altering  the  DNA  sequence. 
DNase  I  hypersensitive  site 
sequencing  (DNase-seq):  a 
method  used  to  characterize 
regulatory  and  open  chromatin 
regions  of  the  genome. 

Expression  quantitative  trait  loci 
(eQTL):  approach  to  link  sequence 
variation  at  a  position  in  the  genome 
to  expression  of  target  gene(s). 
Genome-wide  association  study 
(GWAS):  statistical  association  of 
sequence  variation  with  disease  risk 
or  variability  in  a  measurable 
phenotypic  trait  and/or  feature. 
Glycated  hemoglobin  (HbAIC):  a 
type  of  hemoglobin  modification  that 
is  measured  to  determine  plasma 
glucose  concentration. 
RNA-sequencing  (RNA-seq): 
measures  the  amount  of  RNA  in  a 
sample  at  a  given  time. 

Single  nucleotide  polymorphism 
(SNP):  nucleotide  variation  at  a 
specific  location  in  the  genome  that 
exists  with  >5%  frequency  in  the 
population. 
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new  ones  (FES,  TM6SF2,  and  RREE11  in  the  PRCT,  CILP2,  and  SSR1  loci,  respectively,  and 
TSPAN8,  TFIADA,  HNF1A,  and  HNF4A).  For  the  remaining  loci,  noncoding  SNVs  constitute 
the  putative  causal  SNVs.  Comparison  of  multiple  genetic  models  with  the  empirical  data 
generated  in  this  study  suggest  that  a  long  tail  of  common  variants  with  lower  effect  sizes  may 
comprise  the  missing  heritability  and  reaffirms  the  importance  of  common,  regulatory  varia¬ 
tion  in  the  genetic  architecture  of  T2DM  (see  Outstanding  Questions).  Perhaps  most  impor¬ 
tantly,  this  immense  effort  has  narrowed  the  list  of  putative  causal  SNVs  to  a  handful  for  five 
loci  and  by  50%  on  average  for  the  78  T2DM-associated  autosomal  loci  investigated  [14]. 
Similar  themes  and  reductions  in  credible  sets  were  reported  for  fasting  glucose-  and  insulin- 
associated  loci  [30]. 

Ongoing  islet  epigenomic  and  transcriptomic  analyses  are  progressively  defining  the  regulatory 
potential  of  variant  loci,  identifying  SNV-RE  overlaps,  and  nominating  potential  target  genes, 
whose  dysfunction  is  likely  to  contribute  to  T2DM  [2,3,1 1 ,12,14,15,30-32].  Open  chromatin 
(DNase-seq,  ATAC-seq)  and  histone  modification  and/or  TF-binding  profiling  (ChIP-seq)  indi¬ 
cate  that  T2DM  and  related  trait-associated  SNVs  are  especially  prominent  in  islet  distal  REs 
and  stretch/super  enhancers  [2,3,5,33,34],  Due  to  the  long  distances  over  which  REs  might 
act,  additional  work  to  elucidate  the  target  genes  of  T2DM  SNV-containing  REs  is  needed. 
Chromosome  conformation  capture  techniques,  such  as  3C,  4C,  5C  [35],  Hi-C  [36],  chro¬ 
matin  interaction  analysis  by  paired-end  tag  sequencing  (ChlA-PET)  [37],  and  HiCHIP 
[38]  will  be  important  components  to  effectively  map  interactions  between  REs  and  their  target 
genes  (see  Outstanding  Questions).  In  two  separate  studies,  RNA-sequencing  (RNA-seq)  of 
89  [31]  and  1 18  [1 1]  human  islet  samples  identified  616  and  2341  expression  quantitative 
trait  loci  (eQTLs),  respectively.  These  analyses  were  the  first  studies  linking  SNVs  to  gene 
expression  changes  in  islets  to  define  the  putative  genetic  control  of  islet  function  and  failure. 
However,  of  the  216  eQTLs  common  to  both  studies,  only  14  overlapped  with  T2DM- 
associated  loci  [11].  This  may  be  due  to  power  limitations  and  an  inability  to  detect  eQTLs 
beyond  their  primary  signal.  Alternatively,  this  relatively  low  overlap  could  suggest  that  T2DM 
SNVs  affect  islet  physiological  or  pathophysiological  responses,  not  just  basal  expression,  as 
has  been  measured  to  date.  Indeed,  a  recent  study  suggested  that  several  putative  T2DM 
GWAS  genes  are  regulated  by  NFAT,  a  TF  involved  in  calcineurin  signaling  responses  [39]. 
Alternatively,  the  detection  of  eQTLs  overlapping  T2DM-associated  SNVs  in  peripheral  tissues, 
such  as  skeletal  muscle  [40]  and  adipose  [41]  tissue,  reminds  us  that  these  other  metabolic 
tissues  should  not  be  ignored  in  the  T2DM  molecular  genetics  and  pathogenesis,  and  warrant 
further  investigation  of  genomic  variation  in  these  tissues. 

Recent  islet  studies  suggest  that  regulatory  noncoding  RNAs  (ncRNAs)  contribute  to  diabetes 
progression  and  beta  cell  (dys)function  [31 ,42,43].  Aberrant  expression  of  1 7  long  noncoding 
(IncRNAs)  has  been  associated  with  glycated  hemoglobin  (HbAlc)  levels  [31].  This  study 
identified  eQTLs  for  two  of  these  transcripts  (LOC283  7  77  and  SNHG5),  but  the  eQTL  SNVs  did 
not  overlap  withT2DM  SNVs  [31].  Similarly,  a  study  by  Moran  and  colleagues  identified  nine  out 
of  55  T2DM-associated  loci  that  contained  IncRNAs  located  within  150  kb  of,  but  not  directly 
overlapping,  the  reported  lead  SNVs  [42].  In  the  KCNQ1  locus,  T2DM  risk  SNVs  overlap  both 
KCNQ1  and  KCNQ10T1  [43,44],  a  long  intergenic  noncoding  RNA  (lincRNA)  also  found  to  be 
significantly  induced  in  T2DM  islets  [42],  We  anticipate  that  additional  links  will  emerge  in  the 
coming  years.  Other  studies  suggest  that  islet  IncRNA  alterations  could  also  contribute  to  type 
1  diabetes  mellitus  (T 1  DM),  because  a  T1DM  GWAS  SNV  (rs941576)  was  identified  in  the 
MEG3  lincRNA  locus  [43,45] .  Functional  analyses  in  human  islets  and  rodent  models  will  clarify 
the  roles  of  these  ncRNAs  in  islet  development,  (dys)function,  and  diabetes. 

DNA  methylation  studies  of  nondiabetic  (ND)  and  T2DM  islets  have  suggested  that  epigenetic 
dysregulation  promotesT2DM  development  [46,47],  DNA  methylation  profiling  of  1 5  T2DM  and 


Single  nucleotide  variant  (SNV): 

changes  in  a  given  nucleotide 
sequence  in  the  genome. 
Stretch/super  enhancers: 

extended  (>3  kb)  regions  of  the 
genome  marked  by  enhancer 
chromatin  states;  enriched  near 
genes  that  are  important  for  cell  type 
identity  and  cell  type-specific 
functions. 

Subpopulation:  a  subset  of  cells 
within  a  tissue  distinguished  by  the 
expression  of  specific  marker  genes 
and/or  proteins. 

Trans-differentiation:  the  process 
in  which  a  mature  cell  type  converts 
into  another  mature  cell  type. 
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Key  Figure 

Converging  and  Diverging  Genetic,  Environmental,  and  Cellular  Aspects  of  Islet  (Dys)function  and 
Type  2  Diabetes  Mellitus  (T2DM)  in  Mice  and  Humans 
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^  basal  insulin  secretion  [58] 
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Figure  2.  Parallel  analyses  of  human  and  mouse  islets  are  revealing  important  similarities  [15,23,39,48-51 ,70,76,79,80,83-85]  (A)  and  differences  [48,57-58,61- 
63,69-70,75-76,79-80,83]  (B,C)  between  molecular  features  of  islet  identity  and  (dys)function  in  mice  and  humans.  Black  text  highlights  significant  findings  regarding 
islet  cellular  composition  and  identity.  Blue  text  highlights  longitudinal  and/or  comparative  analyses  of  genome-wide  molecular  data  sets  and  environmental  effects  on 
islet  (dys)function.  These  features  reaffirm  the  value  of  modeling  T2DM  in  mice  to  delineate  important  species-specific  differences  in  islet  biology  that  may  reflect  distinct 
T2DM  causative  mechanisms.  Abbreviations:  f,  increase;  f,  decrease;  GWAS,  genome-wide  association  study  RE,  regulatory  element;TF,  transcription  factor;  T1  DM, 
type  1  diabetes  mellitus 


34  ND  islets  using  the  lllumina  450BeadChip  identified  1 649  differentially  methylated  CpG  sites 
for  853  genes,  17  of  which  reside  in  T2DM-associated  loci  [46],  Surprisingly,  most  (97%)  of 
these  CpG  sites  were  hypomethylated  in  T2DM  islets,  suggesting  that  they  suffer  from 
decreased  methyl  donor  levels  or  decreased  activity  of  DNA  methyltransferases. 
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Genomics  of  Islet  Responses  to  Environmental  Changes  and  T2DM 
Pathogenesis 

Intrinsic  and  extrinsic  environmental  changes,  such  as  aging,  and  Western  diet  and/or  lifestyle, 
respectively,  are  linked  to  islet  dysfunction  and  T2DM  risk  [23,48-50]  (Figure  IB).  Multiple 
groups  have  begun  to  characterize  the  genomic  effects  of  these  environmental  inputs  and 
insults  on  islets.  Transcriptome  profiling  of  adult  and  juvenile  islet  beta  cells  identified  565  (209 
up,  356  down)  and  6123  (2083  up,  4040  down)  differentially  expressed  genes  in  humans  and 
mice,  respectively  [48,49],  Signatures  of  decreased  proliferative  capacity  in  aged  islets  and/or 
beta  cells  were  apparent  in  both  species,  perhaps  best  illustrated  by  increased  CDKN2A/B 
expression,  a  gene  cluster  with  established  cellular  senescence  functions  and  implicated  as 
Type  2  Diabetogenes’  for  a  T2DM  GWAS  signal  on  9p21  [48,49,51].  Unexpectedly,  tran¬ 
scriptome  and  epigenome  signatures  suggested  superior  insulin  secretory  capacity  of  adult 
islets,  which  was  confirmed  functionally  by  glucose-stimulated  insulin  secretion  (GSIS)  assays 
[48,49],  DNA  methylation  and  histone  profiling  indicated  that  these  expression  differences  were 
largely  mediated  by  chromatin  remodeling  and  epigenetic  modification  of  distal  Res,  such  as 
enhancers.  Using  whole-genome  bisulfite  sequencing  (WGBS),  Avrahami  and  colleagues 
identified  approximately  1 4  368  aging-related  differentially  methylated  regions  (DMRs)  between 
the  beta  cells  of  juvenile  and  adult  mice.  DMRs  overlapping  distal  REs  outnumbered  those 
overlapping  promoters  3:1,  and  exhibited  larger  changes  in  magnitude  of  methylation.  Distal 
DMRs  that  lost  methylation  with  aging  were  enriched  for  binding  sites  of  important  islet  TFs, 
such  as  Foxa2,  Neurodl ,  and  Pdxl ,  suggesting  these  factors  mediate  the  expression  differ¬ 
ences  and  improved  functionality  in  adult  islets.  Finally,  genes  showing  differential  expression  in 
adult  islets  were  accompanied  by  differential  methylation  at  nearby  distal  REs  more  often  than 
at  their  promoters.  These  data  suggest  that,  in  addition  to  their  importance  in  T2DM  genetic 
risk,  enhancers  also  govern  important  transcriptional  regulatory  changes  accompanying  or 
mediated  by  aging. 

Circadian  rhythm  links  behavior  and  metabolism  to  day-night  cycles.  Notably,  insulin  secretion 
oscillates  with  a  circadian  periodicity.  Analysis  of  mouse  islet  transcriptomes  revealed  that 
approximately  27%  of  the  beta  cell  transcriptome  (A/=3905  genes)  demonstrated  circadian 
oscillation,  including  genes  responsible  for  insulin  synthesis,  transport,  and  stimulated  exocy- 
tosis  [50].  The  human  orthologs  of  481  of  these  genes  exhibited  circadian  oscillations  in  human 
islets.  ChIP-seq  identified  742  out  of  3905  of  these  oscillatory  genes  as  direct  targets  of  the 
circadian  clock  TFs  CLOCK  and  BMAL1 .  As  with  aging,  most  differential  sites  were  at  distal 
REs.  Beta  cell-specific  deletion  of  Bmall  resulted  in  islet  failure  and  diabetes  in  mice.  This  study 
demonstrates  the  importance  of  circadian  rhythms  in  islet  function  and  suggests  that  genetic  or 
environmental  perturbation  of  this  program  contribute  to  T2DM  risk  and  pathophysiology. 
GWAS  results  suggest  that  this  could  be  the  case,  because  SNVs  in  the  CRY2  locus,  a 
component  of  the  circadian  machinery,  and  MTNR1B,  a  gene  encoding  a  melatonin  receptor, 
are  associated  with  altered  islet  function  and  T2DM  [1 ,52],  It  will  be  interesting  to  see  whether 
genetic  perturbations  in  circadian  clock  TFs  or  their  binding  sites  emerge  as  one  of  the 
molecular  mechanisms  underlying  T2DM  GWAS. 

Maternal  nutrition  and  in  utero  stresses  have  been  linked  to  T2DM  risk  for  offspring  in  humans 
and  rodents  [23,53-55].  Although  changes  in  fetal  nutrition  are  suggested  to  influence  offspring 
metabolism  via  epigenetic  modifications  [23,56],  the  genome-wide  effects  on  the  islet  (epi) 
genome  have  not  been  determined.  Similarly,  stress  responses  to  elevated  oxidative  and/or  ER 
stress  lead  to  islet  failure,  impaired  insulin  secretion,  and  T2DM  susceptibility  [57-59].  Ulti¬ 
mately,  these  responses  converge  on  the  nucleus  and  involve  the  redistribution  or  covalent 
modifications  of  master  TFs  (MAFB,  NKX6-1 ,  and  PDX1)  or  stress  response  factors  (FOXOI , 
ATF4,  and  HIF1  alpha)  [20,22,53,57,58].  (Epi)genomic  analyses  of  these  stress  responses  are 
warranted  and  may  reveal  important  connections  between  T2DM  SNVs  and  altered  islet  stress 
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responses.  Moving  forward,  it  will  be  crucial  to  understand  the  extent  to  which  genetic  and 
epigenetic  changes  interact  in  T2DM  pathogenesis  (see  Outstanding  Questions).  Response 
QTL  (reQTL)  and  epigenome-wide  association  studies  (EWAS)  [56]  should  provide  these 
important  links  (see  Outstanding  Questions).  Indeed,  studies  of  SNV  effects  on  immune  cell 
responses  identified  121  reQTLs,  38  of  which  overlapped  autoimmune  disease-associated 
SNVs  [60].  Specific  factor(s)  and  pathway(s)  activated  by  insulin  resistance  appear  to  differ 
between  mouse  and  human  islets  [57,58]  (Figure  2);  thus,  we  emphasize  that  caution  must  be 
taken  in  study  design  and  interpretation  to  interrogate  this  and  possibly  other  islet  responses. 

Deconstructing  Pancreatic  Islet  Cellular  and/or  Functional  Heterogeneity 

Islets  comprise  1-5%  of  the  pancreas  and  consist  of  at  least  five  endocrine  cell  types  perform¬ 
ing  coordinated  but  distinct  functions  and  each  producing  a  unique  hormone  in  the  islet:  beta 
(insulin),  alpha  (glucagon),  delta  (somatostatin),  gamma  (pancreatic  polypeptide),  and  epsilon 
(ghrelin)  cells  [61-64].  Precise  understanding  of  islet  molecular  changes  during  T2DM  devel¬ 
opment  is  likely  complicated  by  variability  in  islet  cell  type  composition.  On  average,  islets 
comprise  55%  beta  cells,  35%  alpha,  1 0%  delta,  and  less  than  5%  and  1  %  gamma/PP  and 
epsilon  cells,  respectively  [61-63].  However,  this  can  vary  considerably  between  donors,  with 
ranges  of  28.4-76.2%,  23.8-71 .6%,  and  2.4-1 2%  for  beta,  alpha,  and  delta  cell  compositions, 
respectively  [61]  (Figure  2).  This  cellular  heterogeneity,  combined  with  donor-to-donor  variabil¬ 
ity,  masks  the  molecular  repertoire  of  each  cell  type  and  impedes  clear  understanding  of  the 
molecular  programs  perturbed  in  each  cell  type  by  T2DM  pathogenesis. 

Until  recently,  most  studies  had  focused  on  epigenetic  and  transcriptional  analyses  of  whole 
islets  or,  at  the  expense  of  other  cell  types,  beta  cells.  However,  recent  studies  demonstrating 
roles  for  alpha  [65-67]  and  delta  cells  [68-71]  in  modulating  beta  cell  function  and/or  resilience 
and  in  T2DM  pathogenesis  are  fueling  renewed  interest  in  these  cell  types.  First  attempts  to 
overcome  these  obstacles  and  understand  the  molecular  repertoire  of  each  islet  cell  type 
focused  on  transcriptomic  analyses  of  sorted  and  enriched  cell  type  populations  [61 ,72-74]. 
However,  such  methods  were  unable  to  effectively  isolate  and  enrich  the  less  abundant 
nonbeta  cells  [75],  leaving  much  of  the  functional  genomic  landscape  of  islets  imprecisely 
assigned  and/or  classified  or,  in  the  case  of  rarer  islet  cell  types,  undefined. 

Within  the  past  year,  multiple  groups  have  applied  single  cell  transcriptome  profiling  to  islets  to 
begin  to  address  questions  about  islet  physiology  [75-83]  (see  Outstanding  Questions)  with 
single  cell  resolution,  such  as:  (i)  what  is  the  gene  repertoire  of  each  islet  cell  type?  (ii)  Does  the 
gene  repertoire  reveal  any  new  and/or  unexpected  roles  for  each  cell  type  in  islet  (patho) 
physiology?  (iii)  Are  there  novel  cell  types  or  unappreciated  subpopulations  in  islets?  These 
studies  are  providing  new  appreciation  of  the  repertoire  of  both  islet  beta  and  nonbeta  cells. 
Given  that  much  of  the  beta  cell  transcriptional  repertoire  has  been  extensively  studied  [61 ,72- 
74],  several  features  have  been  validated,  including  genes  involved  in  cell  survival  and/or 
maturation  ( PDX1 ),  regulation  of  insulin  secretion  ( RGS16 ,  SYT13,  and  ENTPD3 ),  and  diabetes- 
associated  genes  ( DLK1 ,  MEG3,  and  SLC2A2)  [75,76,78-81 ,83].  Unique  expression  of  genes 
encoding  TFs  ( IRX2 ),  membrane  glycoproteins  ( DPP4 ),  and  hormone  transporters  ( TTR ) 
were  also  validated  in  alpha  cells.  Analysis  of  single  alpha  cell  transcriptomes  uncovered 
signatures  involved  in  wound  healing  (FAP),  blood  clotting  (F70),  and  tissue  biogenesis  (LOXL4) 
[75,76,78-81,83],  suggesting  that  they  share  functions  akin  to  pancreatic  fibroblast  and/or 
mesenchymal  cells. 

Single  cell  profiling  has  provided  new  views  of  the  roles  of  delta  and  PP/gamma  cells  in  islet 
physiology  and  the  molecular  genetics  of  islet  failure  and  diabetes.  For  example,  these  studies 
revealed  that  delta  cells  uniquely  express  appetite-suppressing  leptin  ( LEPR )  and  appetite- 
stimulating  ghrelin  ( GHSR )  hormone  receptors  [75,79,80],  implicating  them  as  the  integrators 


22 


Trends  in  Genetics,  April  2017,  Vol.  33,  No.  4  249 


Trends  in  Genetics 


Cell 


and  regulators  of  these  pathways  in  the  islet.  GHSR  functionality  has  been  demonstrated  in 
both  human  and  mouse  delta  cells  [70].  LEPR  expression  is  unique  to  human  delta  cells, 
suggesting  that  these  cells  uniquely  mediate  the  leptin  response  in  human  islets 
[70,75,76,79,80]  (Figure  2).  Expression  of  genes  associated  with  congenital  hyperinsulinemia 
(CHI)  ( UCP2  and  HADH)  in  delta  cells  further  implicates  this  cell  type  in  the  molecular  genetics  of 
CHI  [75].  PP/gamma  cell  transcriptomes  exhibited  enrichment  of  genes  involved  in  neuronal 
development  ( MEIS2  and  FEV)  [75,78-80]  and  serotonin  catalysis  and  reuptake  (TPH1  and 
SLC6A4)  [75,79,80,83].  T ogether,  these  findings  suggest  that  delta  and  PP/gamma  cells  act  as 
the  ‘brains'  of  the  pancreatic  islets,  capable  of  receiving  and  integrating  various  neuronal 
signals  to  coordinate  islet  function.  Due  to  their  scarcity  in  human  pancreatic  islets  (<1  %  of  islet 
volume),  our  knowledge  of  the  epsilon  cell  repertoire  and  its  putative  function(s)  remain 
speculative.  Nonetheless,  the  insights  gleaned  from  these  initial  studies  undoubtedly  motivate 
follow-up  studies  that  continue  transitioning  from  whole-islet  to  functional  constituent  cell 
studies.  Identification  of  genes  encoding  cell  type-specific  surface  markers  (beta,  LRRTM3 
and  CASR;  alpha,  DPP4  and  PLCEH,  delta,  LEPR,  GHSR,  and  ERBB4 ;  PP/gamma,  SLC6A4 
and  PTGFR ;  and  epsilon,  ANXA13)  [75,79]  provide  new  targets  that  may  be  exploited  for  more 
accurate  purification  of  each  islet  cell  type  and  analysis  of  its  specific  responses  to  genetic  and 
environmental  stressors. 

Islet  Subpopulations  and  Cell  Type  Heterogeneity 

Detection  of  heterogeneous  beta  cell  subpopulations  was  reported  for  enriched  cell  and  single 
cell  studies.  These  include  four  subpopulations  with  differing  expression  of  ST8SIA1  and  CD9 
[84],  five  subpopulations  defined  by  RBP4,  FFAR4/GPR1 20,  ID1,  ID2,  and  ID3  expression  [80], 
and  subpopulations  characterized  by  ER  stress-associated  [76]  and  oxidative  stress-associ¬ 
ated  genes  [79].  Fltp/CFAP126  expression  has  been  reported  to  distinguish  proliferating  and 
mature  beta  cell  subpopulations  in  mice  [85],  but  single  cell  transcriptome  analyses  failed  to 
detect  this  distinction  in  human  beta  cells  [75,83].  However,  proliferative  and  mature  human 
beta  cells  were  identified  by  single  cell  mass  cytometry  analysis  [86],  suggesting  that  mice  and 
humans  make  use  of  distinct  cell  growth  pathways.  Given  that  each  study  detected  distinct 
beta  cell  subpopulations  with  different  gene  signatures,  it  remains  difficult  to  distinguish 
whether  these  subpopulations  are  functionally  distinct  cells  or  the  result  of  technical  con- 
founders,  such  as  the  time  to  sort  and  enrich  in  a  harsh  cell  sorting  environment,  time  for  cell 
capture,  or  cell  and  transcript  capture  efficiency  [87]. 

Single  Cell  Dissection  of  Islet  Dysfunction  and  T2DM 

Single  cell  transcriptome  analyses  provide  a  fresh  and  agnostic  opportunity  to  investigate  the 
putative  mechanisms  underlying  islet  dysfunction  in  T2DM,  To  date,  single  cell  transcriptome 
profiling  has  been  completed  for  a  total  of  1 831  and  1 970  islet  cells  from  26  ND  and  1 5  T2DM 
donors,  respectively  [75,80,81,83].  Comparison  of  T2DM  and  ND  single  cell  transcriptomes 
suggest  that  specific  alterations  in  islet  cell  type  transcriptomes  underlie  T2DM  pathogenesis 
(Figure  3A).  However,  changes  in  cell  proportions  (Figure  3B),  identity,  and  plasticity  (Figure  3C, 
D)  have  also  been  regarded  as  potential  contributors  to  T2DM  [72,88-92],  Specifically, 
decreases  in  diabetic  beta  cell  mass  were  suggested  to  be  caused  by  reversion  to  endocrine 
progenitor  (hormone-negative)  cells  (Figure  3C)  or  different  islet  cell  types  (Figure  3D)  rather 
than  to  apoptosis.  The  model  of  transformed  beta  cell  identity  remains  controversial.  A  recent 
study  concluded  that  the  observed  magnitude  of  decline  in  beta  cell  numbers  in  T2DM  islets  is 
not  accompanied  by  proportionate  increases  in  cells  exhibiting  frans-differentiation  markers 
or  increases  in  other  islet  cell  types  [93].  Rather,  the  presence  of  endocrine  progenitor-like  cells 
in  T2DM  islets  may  represent  newly  forming  endocrine  cells  [93].  Single  cell  profiling  also  did  not 
identify  transcriptomic  evidence  of  dedifferentiated  or  frans-differentiated  cells  in  T2DM  islets 
(Figure  3C,D)  [75,80,83].  Similar  trends  were  observed  in  whole-islet  RNA-seq  data  upon 
deconvolution,  where  cell  type  proportions  did  not  significantly  vary  between  hypoglycemic 
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Figure  3.  Proposed  Cellular  Mechanisms  Contributing  to  Type  2  Diabetes  Mellitus  (T2DM)  Development. 
(Center)  Cartoon  representation  of  human  islet  cellular  composition.  Studies  have  described  the  following  phenomena:  (A) 
Islet  single  cell  transcriptomic  studies  [75,80,83]  suggest  that  cell  type-specific  changes  in  gene  expression  (depicted  as 
half-shaded  cells)  contribute  to  T2DM  pathogenesis.  These  studies  suggest  that  potential  pathogenic  expression  changes 
occur  in  each  islet  cell  type,  not  just  beta  cells.  (B)  Decreases  in  beta  cell  (in  yellow)  numbers  [25,92,100,101],  thought  to 
precede  islet  dysfunction  and  development  of  insulin  resistance.  (C)  Alterations  in  islet  cellular  identity  may  also  account  for 
islet  failure.  Dedifferentiation  of  islet  cell  types  to  precursor  cell  types  and/or  states  (hexagons)  has  been  proposed  to 
underlie  the  loss  of  beta  cell  mass  and  function  in  T2DM  [88,90-92],  (D)  Similarly,  frans-differentiation  of  islet  cell  types  may 
lead  to  imbalances  in  islet  cell  proportions  and  improper  function  [72,88,89] 


and  hyperglycemic  islets  [76].  Thus,  the  transcriptome  data  to  date  do  not  provide  supporting 
evidence  of  dedifferentiation  in  T2DM  islets. 

Transcriptomes  of  each  cell  type  from  ND  and  T2DM  donors  exhibited  remarkable  correlation 
overall.  However,  specific  changes  in  gene  expression  were  reported  in  T2DM  beta  cells, 
including  reduced  expression  of  INS  [75,80],  genes  important  for  insulin  secretion  ( STX1A )  [75] 
and  beta  cell  proliferation  (FXYD2)  [80,83],  as  well  as  elevated  expression  of  genes  implicated  in 
T2DM  GWAS  ( DLK  and ,  DGKB)  [75].  Transcriptional  differences  were  also  identified  in  T2DM 
alpha  cells,  including  expression  of  CD36  [75,80],  a  crucial  activator  of  the  NLRP3  inflamma- 
some  [94],  and  RGS4,  a  negative  regulator  of  GSIS  [80].  Several  genes  were  dysregulated  in 
T2DM  delta  cell  transcriptomes  [75,83].  However,  the  underlying  biology  of  these  candidates 
remains  undefined,  with  no  association  with  islet  growth  or  function  [83].  Aside  from  these 
encouraging  examples,  these  single  cell  studies  have  not  reached  consensus  regarding 
differentially  expressed  genes  between  T2DM  and  ND  cell  types.  Differences  in  islet  donor 
variability,  islet  isolation  and/or  transport,  and  single  cell  dissociation  and/or  sequencing 
protocols  may  explain  these  inconsistencies  across  studies.  We  expect  that  sampling  thou¬ 
sands  of  single  cells  each  from  hundreds  of  individuals  for  large-scale  meta-analyses  will 
provide  a  more  convergent  list  of  cell  type-specific  genes  and  pathways  disrupted  in  T2DM 
islets.  It  will  also  be  important  for  future  studies  to  profile  cells  from  individuals  at  different  points 
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along  the  T2DM  pathogenesis  spectrum,  such  as  prediabetic  individuals  (5.5<HbA1  c<6.0)  to 
identify  and  distinguish  primary  from  secondary  genomic  changes  that  may  be  the  cause  or 
consequence  of  progression  to  T2DM. 

Concluding  Remarks  and  Future  Directions 

The  past  few  years  have  marked  exciting  developments  in  our  understanding  of  the  underlying 
genomic,  environmental,  and  cellular  components  driving  T2DM  pathogenesis.  Numerous 
common  (and  only  few  rare)  genetic  SNVs  have  been  implicated  in  T2DM  progression  [1 3,1 4],  It 
is  unclear  whether  the  ‘missing  T2DM  heritability’  is  explained  by  a  larger  distribution  of 
common  SNVs  with  minimal  effect  sizes,  whether  current  methods  have  missed  critical  rare 
SNVs,  or  whether  it  will  be  captured  by  gene-gene  and  gene-environment  interactions  (such  as 
detected  by  reQTL).  Thus  far,  most  catalogued  T2DM-SNVs  occur  in,  and  disrupt,  islet  RE 
function;  however,  the  causal  connections  between  the  two  remain  challenging  to  decipher. 
eQTL  and  chromatin  accessibility  QTL  (caQTL)  [95,96]  studies  have  been,  and  will  continue  to 
be,  essential  for  linking  genetic  variants  to  molecular  phenotypes.  A  subsequent  challenge  will 
be  to  link  these  molecular  effects  to  pathways  [39]  and  (patho)physiological  phenotypes  [97], 

Functional  genomic  studies  have  identified  minimal  overlap  between  islet  eQTLs  and  T2DM- 
SNVs  [1 1 ,31],  suggesting  that  responses  to  environmental  stress  factors  are  key  mediators  of 
T2DM  pathogenesis.  Mouse  models  have  been  instrumental  in  elucidating  the  genetic  and 
molecular  regulation  of  these  responses  and  how  environmental  stressors  influence  islet  (dys) 
function.  However,  observed  differences  between  mice  and  humans  in  islet  morphology, 
composition,  expression,  and  function  remind  us  to  exercise  caution  when  extrapolating 
findings  in  mice  to  human  T2DM.  Studies  comparing  the  genomic  features  of  human  islets 
and  models  are  essential  to  define  conserved  features  and  those  that  require  modification  to 
determine  what  aspects  of  islet  dysfunction  and  T2DM  we  can  model  effectively  and  to  decide 
how  and/or  where  we  should  manipulate  or  humanize  the  mouse  (epi)genome  to  better  model 
human  T2DM.  (Epi)genome  editing  technologies,  such  as  CRISPR/Cas9,  can  then  be  applied 
to  develop  new  cellular  and  animal  models  to  more  effectively  study  islet  phenotypic  changes 
resulting  from  genetic  and  environmental  variation.  We  anticipate  that  these  integrative  geno¬ 
mic  studies  and  techniques  will  also  serve  as  valuable  resources  to  determine  the  underlying 
genetic  changes  and  mechanisms  of  beta  cell  dysfunction  that  lead  to  T1  DM  [98]. 

Rapid  developments  in  single  cell  NGS  technologies  have  renewed  interest  in  the  less-studied 
islet  cell  types.  Deconstructing  the  major  molecular  changes  that  occur  in  each  cell  type  during 
T2DM  progression  has  proven  challenging,  yielding  inconsistent  results  between  studies  due  to 
patient  donor  variability  and  technical  sequencing  artifacts.  This  is  also  likely  the  result  of  limited 
statistical  power.  In  the  future,  it  will  be  interesting  to  perform  meta-analyses  of  available 
transcriptomic  data  to  maximize  our  confidence  of  changes  in  cell  specific  expression  pro¬ 
grams.  Together,  the  innovative  new  genomic  technologies  of  the  past  few  years  will  allow  us  to 
more  precisely  define,  model,  and  manipulate  the  genes  and  pathways  that  have  gone  awry  in 
T2DM,  with  the  ultimate  goal  of  designing  novel  therapeutic  approaches. 
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Outstanding  Questions 

T2DM-associated  GWAS  variants 
explain  only  a  small  portion  of  T2DM 
heritability,  with  rare  variants  showing 
minimal  contribution.  Does  a  long  tail 
of  common  variants  with  small  effect 
sizes  explain  this  missing  heritability? 
Or  are  we  simply  'underpowered'  to 
detect  rare  variants  and  their  contribu¬ 
tion  to  T2DM  heritability? 

What  are  the  genes  targeted  by  T2DM 
GWAS  sequence  variant  (SV)-contain- 
ing  regulatory  elements?  Are  these 
links  context  specific?  Does  the  risk 
allele  enhance  (gain-of-funotion)  or 
repress  (loss-of-function)  RE  function? 

How  do  the  transoriptomes  and/or  epi- 
genomes  of  islets  and  islet  cell  types 
change  when  subjected  to  variable 
environmental  stressors  (oxidative 
stress,  inflammation,  diet,  etc.)?  How 
are  they  changed  by  intrinsic  (aging, 
circadian  rhythms,  etc.)  environmental 
factors?  Which  SVs  regulate  and  alter 
these  islet  responses? 

What  are  the  precise  cellular  and 
molecular  pathophysiological  changes 
in  each  cell  type  that  lead  to  T2DM? 
Are  the  major  pathological  changes 
beta  cell  specific  or  do  they  involve 
other  islet  cell  types  and/or  non-islet 
cell  types? 

How  many  islet  and  single  cell  samples 
must  be  obtained  to  effectively  capture 
combined  cell  type  heterogeneity  while 
controlling  for  technical  and  experi¬ 
mental  confounders?  How  many  sam¬ 
ples  are  needed  to  observe  genetic 
and/or  epigenetic  differences  between 
T2DM  and  ND  states?  Would  stratifi¬ 
cation  of  islets  by  T2DM  risk  genotype 
improve  cell  type-specific  T2DM 
signatures? 
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Single-cell  transcriptomes  identify  human  islet  cell 
signatures  and  reveal  cell-type-specific  expression 
changes  in  type  2  diabetes 

Nathan  Lawlor,1,4  Joshy  George,1,4  Mohan  Bolisetty,1  Romy  Kursawe,1  Lili  Sun,1 
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Blood  glucose  levels  are  tightly  controlled  by  the  coordinated  action  of  at  least  four  cell  types  constituting  pancreatic  islets. 
Changes  in  the  proportion  and/or  function  of  these  cells  are  associated  with  genetic  and  molecular  pathophysiology  of 
monogenic,  type  I,  and  type  2  (T2D)  diabetes.  Cellular  heterogeneity  impedes  precise  understanding  of  the  molecular  com¬ 
ponents  of  each  islet  cell  type  that  govern  islet  (dys)function,  particularly  the  less  abundant  delta  and  gamma /pancreatic 
polypeptide  (PP)  cells.  Here,  we  report  single-cell  transcriptomes  for  638  cells  from  nondiabetic  (ND)  and  T2D  human  islet 
samples.  Analyses  of  ND  single-cell  transcriptomes  identified  distinct  alpha,  beta,  delta,  and  PP/gamma  cell-type  signatures. 
Genes  linked  to  rare  and  common  forms  of  islet  dysfunction  and  diabetes  were  expressed  in  the  delta  and  PP/gamma  cell 
types.  Moreover,  this  study  revealed  that  delta  cells  specifically  express  receptors  that  receive  and  coordinate  systemic  cues 
from  the  Ieptin,  ghrelin,  and  dopamine  signaling  pathways  implicating  them  as  integrators  of  central  and  peripheral  met¬ 
abolic  signals  into  the  pancreatic  islet.  Finally,  single-cell  transcriptome  profiling  revealed  genes  differentially  regulated  be¬ 
tween  T2D  and  ND  alpha,  beta,  and  delta  cells  that  were  undetectable  in  paired  whole  islet  analyses.  This  study  thus 
identifies  fundamental  cell-type-specific  features  of  pancreatic  islet  (dys)function  and  provides  a  critical  resource  for  com¬ 
prehensive  understanding  of  islet  biology  and  diabetes  pathogenesis. 

[Supplemental  material  is  available  for  this  article.] 


Pancreatic  islets  of  Langerhans  are  clusters  of  at  least  four  different 
hormone-secreting  endocrine  cell  types  that  elicit  coordinated — 
but  distinct — responses  to  maintain  glucose  homeostasis.  As 
such,  they  are  central  to  diabetes  pathophysiology.  On  average, 
human  islets  consist  mostly  of  beta  (54%),  alpha  (35%),  and  delta 
(11%)  cells;  up  to  a  few  percent  gamma/pancreatic  polypeptide 
(PP)  cells;  and  very  few  epsilon  cells  (Brissova  et  al.  2005;  Cabrera 
et  al.  2006;  Blodgett  et  al.  2015).  Human  islet  composition  is  nei¬ 
ther  uniform  nor  static  but  varies  between  individuals  and  across 
regions  of  the  pancreas  (Brissova  et  al.  2005;  Cabrera  et  al.  2006; 
Blodgett  et  al.  2015).  Cellular  heterogeneity  complicates  molecular 
studies  of  whole  human  islets  and  may  mask  important  role(s) 
for  less  common  cells  in  the  population  (Dorrell  et  al.  2011b; 
Bramswig  et  al.  2013;  Nica  et  al.  2013;  Blodgett  et  al.  2015;  Liu 
and  Trapnell  2016).  Moreover,  it  complicates  attempts  to  identify 
epigenetic  and  transcriptional  signatures  distinguishing  diabetic 
from  nondiabetic  (ND)  islets,  leading  to  inconsistent  reports  of 
genes  and  pathways  affected  (Gunton  et  al.  2005;  Marselli  et  al. 
2010;  Taneera  et  al.  2012;  Dayeh  et  al.  2014).  Conventional  sorting 
and  enrichment  techniques  are  unable  to  specifically  purify  each 
human  islet  cell  type  (Dorrell  et  al.  2008;  Nica  et  al.  2013; 
Bramswig  et  al.  2013;  Hrvatin  et  al.  2014;  Blodgett  et  al.  2015), 
thus  a  precise  understanding  of  the  transcriptional  repertoire  gov- 
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erning  each  cell  type's  identity  and  function  is  lacking.  Identifying 
the  cell-type-specific  expression  programs  that  contribute  to  islet 
dysfunction  and  type  2  diabetes  (T2D)  should  reveal  novel  targets 
and  approaches  to  prevent,  monitor,  and  treat  T2D. 

In  this  study,  we  sought  to  decipher  the  transcriptional  reper¬ 
toire  of  each  islet  cell  type  in  an  agnostic  and  precise  manner  by 
capturing  and  profiling  pancreatic  single  cells  from  ND  and  T2D 
individuals.  From  these  profiles,  we  identified  transcripts  uniquely 
important  for  each  islet  cell  type's  identity  and  function.  Finally, 
we  compared  T2D  and  ND  individuals  to  identify  islet  cell-type- 
specific  expression  changes  that  were  otherwise  masked  by  islet 
cellular  heterogeneity.  The  insights  and  data  from  this  study  pro¬ 
vide  an  important  foundation  to  guide  future  genomics-based  in¬ 
terrogation  of  islet  dysfunction  and  diabetes. 

Results 

Islet  single-cell  transcriptomes  accurately  recapitulate  those 
of  intact  islets 

Pancreatic  islets  (>85%  purity  and  >90%  viability)  were  obtained 
from  eight  human  cadaveric  organ  donors  (five  ND,  three  T2D) 
(Fig.  1A;  Supplemental  Table  SI).  Each  islet  sample  was  processed 
to  generate  single-cell  RNA-seq  libraries  (Fig.  1A;  single  cell)  and 
paired  bulk  RNA-seq  libraries  at  three  different  stages  of  islet  pro¬ 
cessing  (Fig.  1A;  baseline,  intact,  and  dissociated).  All  RNA-seq 

©  201 7  Lawlor  et  al.  This  article,  published  in  Genome  Research,  is  available  un¬ 
der  a  Creative  Commons  License  (Attribution  4.0  International),  as  described  at 
http://creativecommons.Org/licenses/by/4.0/. 
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Figure  1.  Single-cell  transcriptomes  reflect  those  of  paired  intact  islets.  (A)  Schematic  of  experimental  workflow.  Islets  from  each  donor  sample  (n  =  8 
individuals)  were  dissociated  using  Accutase,  and  single-cell  transcriptomes  were  synthesized  from  1 050  cells  captured  using  1 1  Fluidigm  Cl  chips.  In  par¬ 
allel,  "bulk"  RNA-seq  libraries  were  prepared  from  remaining  dissociated  single  cells  (dissociated)  and  from  intact  islets  either  flash  frozen  (baseline)  or  in¬ 
cubated/processed  (intact).  (8)  Unsupervised  hierarchical  clustering  of  baseline,  intact,  and  dissociated  islet  transcriptomes  demonstrates  clustering  by 
person  and  not  by  processing/experimental  condition.  (C)  Histogram  demonstrating  the  number  of  genes  detected  in  each  single  cell.  Cells  expressing 
less  than  3500  genes  (n=  72)  were  removed  from  downstream  analyses.  (D)  Scatter  plot  comparing  intact  islet  bulk  RNA-seq  (n  =  8)  and  ensemble  sin¬ 
gle-cell  RNA-seq  (n  =  97 8)  data  demonstrates  high  correlation,  (fi2)  Pearson's  R-squared;  (TPM)  transcripts  per  million;  (P)  person. 


methods  employed  SMARTer  chemistry  (Methods),  and  bulk  islet 
cDNA  libraries  were  sequenced  to  an  average  approximate  depth 
of  34  million  reads  (Supplemental  Table  S2).  Baseline,  intact,  and 
dissociated  transcriptomes  from  each  person  were  highly  correlat¬ 
ed  (Supplemental  Fig.  SI).  Transcriptomes  clustered  by  donor  and 
not  by  processing  condition  or  incubation  time  (Fig.  IB),  strongly 
suggesting  that  islet  processing  did  not  significantly  alter  islet 
transcriptomes. 

A  total  of  1050  islet  cells  (622  ND  and  428  T2D)  were  captured 
on  11  Fluidigm  Cl  chips.  cDNA  libraries  were  constructed  from 


captured  cells  and  barcoded,  fragmented,  pooled,  and  sequenced 
to  an  average  depth  of  3  million  reads  (Supplemental  Table  S2). 
Two  separate  library  preparations  from  the  same  amplified  cDNA 
for  each  of  83  single  cells  demonstrated  remarkable  correlation, 
suggesting  minimal  batch  effects  resulting  from  the  cDNA  process¬ 
ing  and  sequencing  steps.  Resequenced  samples  are  highlighted  in 
Supplemental  Table  S2  but  were  not  included  in  subsequent  anal¬ 
yses.  Transcript  coverage  is  indicated  in  Supplemental  Figure  S2. 
Approximately  81%  (21,484/26,616)  of  protein-coding  genes 
and  long  intergenic  noncoding  RNAs  (lincRNAs)  were  detected 
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in  at  least  one  cell  from  the  collection.  On  average,  each  single  cell 
expressed  5944  genes  (Fig.  1C).  Cells  expressing  less  than  3500 
genes  (n  =  72)  also  exhibited  high  mitochondrial  alignment  rates 
and  other  reported  transcriptional  metrics  of  cell  death  and/or 
poor  quality  (Ilicic  et  al.  2016;  Xin  et  al.  2016)  and  were  removed 
from  subsequent  analyses  (Fig.  1C). 

We  next  assessed  the  extent  to  which  the  remaining  978  sin¬ 
gle-cell  transcriptomes  represent  the  expression  patterns  observed 
in  intact  islets.  Single-cell  transcriptome  ensembles  from  each  per¬ 
son  were  highly  correlated  (Pearson's  R2  ranged  from  0.91-0.98) 
(Supplemental  Fig.  S3),  regardless  of  disease  state.  Pearson's  R2  val¬ 
ues  between  individuals'  single-cell  ensembles  and  corresponding 
"bulk"  transcriptomes  ranged  from  0.75-0.86  (Supplemental  Fig. 
S4)  and  did  not  differ  substantially  between  ND  (R2  =  0.87)  and 
T2D  (R2  =  0.85)  samples  (Supplemental  Fig.  S5).  Overall,  ensem¬ 
ble/aggregate  single-cell  transcriptome  profiles  correlated  well 
with  those  of  pooled  bulk  islet  transcriptomes  from  all  individuals 
(Fig.  ID,  R2  =  0.87).  These  results  suggest  that  the  islet  single-cell 
transcriptomes  are  high  quality  and  effectively  reflect  bulk  islet 
transcriptomes. 

Single-cell  profiling  captures  transcriptomes  of  major  and  minor 
pancreatic  endocrine  and  exocrine  cell  types 

Five  islet  endocrine  cell  types  have  been  assigned  based  on  exclu¬ 
sive  and  robust  expression  of  the  peptide  hormone  genes  INS 
(beta),  GCG  (alpha),  SST  (delta),  PPY  (PP/gamma),  and  GHRL  (epsi¬ 
lon)  (Baetens  et  al.  1979;  Nussey  and  Whitehead  2001;  Zhao  et  al. 
2008;  Li  et  al.  2016;  Xin  et  al.  2016;  Wang  et  al.  2016).  The  pancre¬ 
as  also  contains  three  exocrine  cell  types — acinar,  stellate,  and  duc¬ 
tal — that  critically  support  digestion  through  synthesis  and 
transport  of  digestive  enzymes  (Pandol  2011;  Reichert  and  Rustgi 
2011).  Each  also  has  been  identified  by  specific  marker  gene  ex¬ 
pression,  including  the  serine  peptidase  gene  PRSS1  (acinar) 
(Dabbs  2013),  the  extracellular  matrix  protein  gene  COL1A1  (stel¬ 
late)  (Mathison  et  al.  2010),  and  the  structural  keratin  gene  KRT19 
(ductal)  (Dorrell  et  al.  2008,  2011a, b;  Reichert  and  Rustgi  2011). 
We  used  these  marker  genes  to  determine  the  representation  of 
each  islet  cell  type  among  our  978  profiled  single  cells. 

Density  plots  (Fig.  2A)  revealed  bimodal  expression  of  each 
marker  gene  across  the  population  of  single  cells.  Therefore,  we 
employed  Gaussian  mixture  modeling  (GMM)  to  classify  the  cells 
unambiguously  (Fig.  2B).  Approximate  log2  counts  per  million 
(CPM)  thresholds  for  each  marker  gene  used  to  classify  cell  types 
are  provided  in  Supplemental  Table  S3.  This  approach  identified 
617  single  cells  ( — 63%)  from  T2D  and  ND  islets  expressing  a  single 
marker  gene  representative  of  each  major  endocrine  and  exocrine 
cell  type,  examples  of  which  are  shown  in  Figure  2C.  This  included 
239  alpha,  264  beta,  25  delta,  and  18  PP/gamma  cells  (Table  1);  the 
proportions  of  each  cell  type  are  in  the  ranges  previously  reported 
(Brissova  et  al.  2005;  Cabrera  et  al.  2006;  Blodgett  et  al.  2015).  Only 
one  cell  expressing  high  levels  (log2CPM  >  15)  of  GHRL  was  identi¬ 
fied,  which  we  presume  to  be  an  exceedingly  rare  epsilon  cell. 
Additionally,  we  obtained  19  stellate,  24  acinar,  and  27  ductal  cells 
(Table  1),  presumably  from  exocrine  contamination  of  the  islet  cell 
preparations.  Only  21  cells  (~2%)  expressed  none  of  the  specified 
marker  genes  (Table  1).  Approximately  one-third  (340/978)  of  cells 
expressed  more  than  one  marker  gene;  these  were  removed  from 
subsequent  analysis  due  to  concerns  that  these  represent  two  ver¬ 
tically  stacked  cells  in  a  given  capture  site  (for  details,  see 
Methods) .  Similar  ratios  of  potential  stacked  cells  have  been  report¬ 
ed  in  other  studies  using  the  Fluidigm  Cl  platform  to  capture 


mouse  (Xin  et  al.  2016)  and  human  islet  cells  (Wang  et  al.  2016). 
Collectively,  these  single-cell  data  capture  transcriptome  profiles 
representing  each  of  the  major  and  minor  pancreatic  endocrine 
and  exocrine  cell  types.  Genome  Browser  tracks  representing  ag¬ 
gregate  single-cell  expression  for  each  islet  cell  type  have  been  gen¬ 
erated  using  HOMER  (Heinz  et  al.  2010)  and  are  made  available 
(see  Data  Access)  to  facilitate  their  use  and  investigation  by  mem¬ 
bers  of  the  islet  biology  and  diabetes  research  communities. 

Unsupervised  analyses  of  islet  single-cell  transcriptomes  identify 
discrete  clusters  corresponding  to  cell  type 

To  determine  if  and  how  islet  cell  transcriptomes  cluster,  we 
completed  unsupervised  dimensionality  reduction  via  t-distribu- 
ted  stochastic  neighbor  embedding  (t-SNE)  on  380  ND  single-cell 
samples  (excluding  "multiple”  labeled  samples).  t-SNE  assembled 
single-cell  transcriptomes  into  discrete  clusters  based  upon  1824 
highly  expressed  genes  (see  Methods;  Supplemental  Table  S4); 
GMM-based  marker  gene  analysis  revealed  that  each  cluster  corre¬ 
sponded  to  a  distinct  endocrine  and  exocrine  cell  type  (Fig.  3A; 
Supplemental  Fig.  S6).  Unsupervised  hierarchical  clustering  also 
grouped  single-cell  transcriptomes  into  discrete  cell  types  (Fig. 
3B).  Despite  being  obtained  from  different  individuals,  161/168 
beta,  128/138  alpha,  15/16  delta,  and  12/12  PP/gamma  cell  tran¬ 
scriptomes  clustered  into  the  same  dendrogram  branches,  strongly 
suggesting  that  cell  type  encodes  the  greatest  variation  in  the  data. 
Exocrine  cells  and  those  expressing  none  of  the  specified  marker 
genes  ("none")  clustered  separately  from  the  endocrine  cell  types. 
Importantly,  this  clustering  was  driven  by  neither  sequencing 
depth  (Supplemental  Fig.  S7B)  nor  expression  of  classic  marker 
genes  (INS,  GCG,  SST,  PPY,  GHRL,  COL1A1,  PRSS1,  and  KRT19),  as 
cells  continued  to  cluster  into  discrete  cell  types  even  when  all 
marker  genes  were  removed  from  the  expression  data  sets 
(Supplemental  Figs.  S7C,  S8).  Recent  studies  have  reported  hetero¬ 
geneity  among  beta  cells.  Specifically,  Dorrell  et  al.  characterized 
four  subpopulations  of  human  beta  cells  based  on  differing 
ST8SIA1  and  CD9  expression  (Dorrell  et  al.  2016).  Similarly,  Bader 
et  al.  2016  distinguished  two  populations  of  proliferating  (Fltp*) 
and  mature  (Fltp~)  mouse  beta  cells.  We  did  not  find  evidence  of 
beta  cell  subpopulations  (Supplemental  Fig.  S9),  nor  did  we  identify 
numerous  proliferating  cells  (Supplemental  Table  S5).  T2D  single¬ 
cell  transcriptomes  (n  =  258)  also  demonstrated  clear  clustering 
by  cell  type  in  unsupervised  analyses  (Supplemental  Figs.  S10- 
S14)  based  on  1908  highly  expressed  genes  (Supplemental  Table 
S4).  Thus,  each  endocrine  and  exocrine  pancreatic  cell  type 
exhibits  a  complex  characteristic  expression  signature  that  unique¬ 
ly  identifies  it. 

Differential  expression  analyses  reveal  islet  cell-type-specific 
transcriptional  signatures 

To  identify  gene  signatures  distinguishing  each  islet  cell  type,  we 
completed  a  series  of  pairwise  differential  expression  analyses 
(Supplemental  Table  S6)  between  each  cell  type  (see  Methods). 
After  intersecting  the  results  from  each  pairwise  comparison,  we 
identified  a  conservative  collection  of  154  islet  endocrine  cell- 
type  "signature"  genes  (61  beta,  51  alpha,  17  delta,  25  gamma), 
as  well  as  202  exocrine  genes  (109  stellate,  31  acinar,  62  ductal) 
at  5%  false-discovery  rate  (FDR)  (Fig.  3C;  Supplemental  Table  S7). 
Two  genes  exhibited  overlap  between  the  endocrine  and  exocrine 
signature  lists:  FAP  (alpha  and  stellate  cell  overlap)  and  TNS1  (beta 
and  stellate  cell  overlap).  Gene  set  enrichment  analysis  (GSEA) 
identified  enrichment  (FDR-adjusted  P-value  <0.05)  of  insulin 
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Figure  2.  Cell-type  classification  based  on  marker  gene  expression.  (4)  Density  plots  demonstrating  endocrine  and  exocrine  marker  gene  expression 
across  all  single  cells.  (B)  Schematic  of  the  Gaussian  mixture  model  method  applied  to  assign  cell-type  identity  based  on  marker  gene  expression.  (C) 
UCSC  Genome  Browser  views  of  representative  single-cell  expression  profiles  of  INS,  GCG,  SST,  PPY,  and  CHRL  genes  encoding  beta,  alpha,  delta,  PP/gam- 
ma,  and  epsilon  cell  hormones  of  the  endocrine  pancreas,  respectively,  and  marker  genes  for  stellate  ( COL1A1 ),  acinar  ( PRSS1 ),  and  ductal  (KRTI 9)  cells  of 
the  exocrine  pancreas.  Line  colors  indicate  putative  beta  (red),  alpha  (blue),  delta  (green),  PP/gamma  (purple),  epsilon  (orange),  stellate  (black),  acinar 
(dark  gray),  and  ductal  cells  (light  gray).  (PP)  pancreatic  polypeptide;  (CPM)  counts  per  million. 


signaling,  oxidative  phosphorylation,  maturity-onset  diabetes 
of  the  young  (MODY),  and  glycolysis/gluconeogenesis  KEGG 
pathways  in  beta  cells  relative  to  the  other  endocrine  cells 
(Supplemental  Table  S8). 

Signature  genes  included  previously  reported  beta-specific 
genes  like  NKX6-1,  DLK1,  and  ADCYAP1  (Fig.  3C,  right)  and  alpha 
cell-specific  genes  like  IRX2,  LOXL4,  and  DPP4,  a  cell  surface  re¬ 
ceptor  and  diabetes  drug  target  (Dorrell  et  al.  2011a;  Bramswig 
et  al.  2013;  Nica  et  al.  2013;  Blodgett  et  al.  2015).  Among  delta 


cell  signature  genes,  we  detected  exclusive  expression  of  HHEX, 
a  transcription  factor  reported  to  govern  delta  cell  identity  and 
function  and  linked  to  T2D  GWAS  (Zhang  et  al.  2014).  Delta  cells 
also  specifically  expressed  BCHE,  which  encodes  butyrylcholines- 
terase.  BCHE  catalyzes  the  breakdown  of  acetylcholine  and  ghre- 
lin  (Chen  et  al.  2015),  thus  providing  a  mechanism  for  delta 
cells  to  exert  local  inhibition  of  islet-influencing  endocrine  sig¬ 
nals.  PP/gamma  cell-specific  transcriptomes  included  CTD- 
2008P7.8,  a  lincRNA  of  unknown  function;  CNTNAP5,  a  member 
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Table  1.  Number  of  profiled  cells  for  each  pancreatic  cell  type  based 
on  marker  gene  expression 


Putative  cell 
type  (marker 
gene) 

Cell  ontology 
accession  no. 

Nondiabetic 

(ND) 

Type  2 
diabetic 
(T2D) 

Alpha  (CCC) 

CL00001  71 

1  38  (23.47%) 

101  (25.9%) 

Beta  (INS) 

CL:00001 69 

168  (28.57%) 

96  (24.62%) 

Delta  (SST) 

CL00001  73 

16  (2.72%) 

9  (2.31%) 

PP/gamma  (PPY) 

CL0002275 

1 2  (2.04%) 

6  (1 .54%) 

Epsilon  (CHRL) 

CL000501 9 

1  (0.17%) 

0 

Stellate  (COL1A1) 

CL:000241 0 

9  (1.53%) 

1 0  (2.56%) 

Acinar  (PRSS7) 

CL:0002064 

1 5  (2.55%) 

9  (2.31%) 

Ductal  (KRTI9) 

CL0002079 

1 1  (1 .87%) 

16  (4.1%) 

Multiple 

— 

208  (35.37%) 

132 

(33.85%) 

None  (other) 

— 

10  (1.7%) 

1 1  (2.82%) 

Total 

588 

390 

of  the  neurexin  family  of  cell  adhesion  molecules;  and  ID4,  which 
encodes  an  inhibitor  of  DNA-binding  protein.  In  addition  to 
DPP4,  we  detected  30  islet  signature  genes  whose  proteins 
SWISSPROT  predicts  to  localize  to  the  cell  surface  (Supplemental 
Table  S9).  DPP4  antibodies  have  recently  been  used  to  isolate 
purer  alpha  cell  populations  from  islets  (Arda  et  al.  2016).  Thus, 
antibodies  against  the  other  candidate  cell-type-specific  surface 
markers  we  have  identified  may  be  useful  to  purify  other  islet 
cell  types. 

Single-cell  profiling  identifies  unexpected  overlap  in  expression 
between  minor  and  major  islet  cell  types 

Cell  sorting  and  enrichment  methods  such  as  fluorescence-activat¬ 
ed  cell  sorting  (FACS)  have  been  used  to  identify  characteristic  al¬ 
pha  and  beta  cell  genes  (Dorrell  et  al.  201  la, b;  Bramswig  et  al. 
2013;  Nica  et  al.  2013;  Blodgett  et  al.  2015).  However,  expression 
of  SST  or  PPY  in  the  reported  alpha  and  beta  cell  gene  sets  suggests 
the  presence  of  the  less  abundant  delta  and  PP/gamma  islet  cell 
types  in  the  enriched  cell  preparations.  To  distinguish  genes  exhib¬ 
iting  alpha-  and  beta-specific  gene  expression  from  those  ex¬ 
pressed  also  in  delta  and  PP/gamma  cells,  we  investigated  the 
expression  of  previously  reported  alpha-  and  beta-specific  genes 
(Supplemental  Table  S10;  Supplemental  Fig.  S15)  in  our  ND  endo¬ 
crine  single-cell  transcriptomes.  Only  1 15/1683  previously  report¬ 
ed  beta-specific  genes  were  expressed  greater  than  fourfold  higher 
in  beta  cells  relative  to  the  other  endocrine  cells  (FDR  <  0.05;  one¬ 
way  ANOVA  followed  by  Tukey's  honest  significant  difference 
[THSD])  (Fig.  3D).  Similarly,  75/1853  reported  alpha-specific  genes 
were  alpha  cell  enriched  (Fig.  3E).  Several  genes  previously  report¬ 
ed  to  be  enriched  in  the  major  islet  cell  types,  such  as  MAFA, 
SLC2A2,  SIX3,  and  DLK1  in  beta  cells  and  IRX2,  DPP4,  and 
ADORA2A  in  alpha  cells,  were  confirmed  to  be  signature  genes. 
Surprisingly,  we  found  that  37  and  33  reported  beta-  and  alpha- 
specific  genes  were  also  expressed  in  delta  and  PP/gamma  cells,  re¬ 
spectively  (Fig.  3F;  Supplemental  Table  S10).  Notable  examples  in¬ 
cluded  beta  and  delta  cell  expression  of  the  congenital 
hyperinsulinemia  (CHI)  gene  HADH  and  alpha  and  PP/gamma 
cell  expression  of  the  ARX  transcription  factor  (Liu  et  al.  2011). 
HADH  is  typically  associated  with  beta  cell  expression  and,  when 
mutated,  leads  to  insulin  hypersecretion  and  CHI  (Kapoor  et  al. 
2010;  Pepin  et  al.  2010);  these  data  implicate  the  delta  cell  in  the 
molecular  genetics  of  CHI.  Misexpression  of  ARX  has  been  shown 
to  convey  both  alpha  and  PP/gamma  cell  features  to  cells 


(Collombat  et  al.  2007),  suggesting  that  its  expression  in  each 
cell  type  is  important  for  identity  and  function. 

Genes  underpinning  metabolic  function,  regulation  of  energy 
homeostasis,  and  satiety  are  specific  to  distinct  islet  cell  types 

Perturbations  in  genes  involved  in  glucose  sensing  and  proper 
maintenance  of  blood  glucose  levels  contribute  to  T2D  pathophys¬ 
iology  (Schuit  et  al.  2001;  MacDonald  et  al.  2005).  Beta  cells  regu¬ 
late  blood  glucose  through  the  secretion  of  insulin  and  are  thus 
exquisitely  sensitive  to  blood  glucose  levels.  Glucose-stimulated 
insulin  secretion  (GSIS)  is  linked  to  universal  basic  pathways  of  cel¬ 
lular  metabolism  in  beta  cells.  To  gain  insight  into  beta  cell-type- 
specific  transcriptomic  features  associated  with  GSIS,  namely,  glu¬ 
cose  uptake  and  glycolysis,  we  examined  the  expression  of  relevant 
genes  in  our  islet  single-cell  transcriptomes  (Fig.  4A). 

GSIS  pathway  genes  associated  with  glucose  sensing  and  up¬ 
take  displayed  highly  beta  cell-specific  expression,  including 
SLC2A2,  which  encodes  the  glucose  transporter  GLUT2;  G6PC2, 
which  encodes  a  subunit  of  glucose-6-phosphatase;  and  PFKFB2, 
which  encodes  an  enzyme  involved  in  regulation  of  glycolysis 
(Fig.  4A;  Chen  et  al.  2008;  Muller  et  al.  2015).  While  expressed  in 
all  cell  types,  the  enzyme,  ALDOA1,  immediately  downstream 
from  PFK1  and  associated  with  the  glycerol  phosphate  (GP)  shut¬ 
tle,  is  enriched  in  beta  cells,  perhaps  reflecting  an  additional  point 
of  GSIS  control.  Protein-coding  genes  for  five  subsequent  glycolyt¬ 
ic  enzymatic  steps  from  glyceraldehyde-3-phosphate  to  pyruvate 
were  not  significantly  differentially  expressed  between  cell  types. 
Beta  cells  are  known  to  be  limited  in  their  ability  to  produce  lactate 
from  pyruvate  (Fridlyand  and  Philipson  2010);  this  is  reflected  by 
high  LDHB/LDHA  ratios  that  favor  the  lactate  to  pyruvate  flux  in 
beta  cells. 

The  glycerol-3-phosphate  shuttle  allows  NAD+  regeneration 
in  the  cytosol  to  sustain  glycolytic  flux  essential  for  GSIS. 
Cytoplasmic  NAD+  generation  has  been  shown  to  be  essential 
for  GSIS  (Eto  et  al.  1999).  Both  components  of  the  glycerol-3-phos- 
phate  shuttle,  cytoplasmic  GPD1  and  mitochondrial  GPD2,  were 
expressed  in  beta  cells,  with  the  former  representing  a  beta  cell  sig¬ 
nature  gene  (Fig.  4A).  Additionally,  we  identified  the  mitochondri¬ 
al  solute  transporter  SLC25A34  as  beta  cell  specific.  While  its 
transport  specificities  have  yet  to  be  determined,  the  closest  yeast 
ortholog  of  SLC25A34,  Oaclp/YKL120w  (Palmieri  et  al.  1999; 
Marobbio  et  al.  2008),  is  thought  to  import  oxaloacetate  into  the 
mitochondria.  This  is  particularly  intriguing  considering  our 
data  and  others  (MacDonald  et  al.  2011)  show  the  complete  ab¬ 
sence  of  pyruvate  carboxylase  (PC)  expression  in  human  beta  cells, 
despite  the  essential  role  PC  is  known  to  play  in  rodent  GSIS 
(Sugden  and  Holness  2011)  through  mitochondrial  production 
of  oxaloacetate.  We  hypothesize  that  SLC25A34  may  provide  an 
alternate,  cytoplasmic  source  for  mitochondrial  oxaloacetate  in 
the  human  beta  cell. 

Single-cell  profiling  also  allowed  us  to  interrogate  the  tran¬ 
scriptional  repertoire  of  less  abundant  delta  and  PP/gamma  cell 
types,  which  have  been  elusive  in  both  whole  islet  and  sorted  islet 
studies.  While  it  is  difficult  to  determine  epsilon  cell  expression 
signatures  with  one  ghrelin-positive  cell,  our  ND  data  set  includes 
16  delta  cells  and  12  PP/gamma  cells.  Among  the  top  100  differen¬ 
tially  expressed  (FDR<  5%)  genes  in  delta  versus  other  islet  endo¬ 
crine  cells  are  receptors  for  the  appetite-regulating  hormones 
leptin  ( LEPR )  and  ghrelin  ( GHSR ),  the  growth  factor  neuregulin  4 
(ERBB4),  and  the  neurotransmitter  dopamine  ( DRD2 )  (Fig.  4B). 
GHSR  has  recently  been  shown  to  be  specifically  expressed  and 
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Figure  3.  Statistical  analysis  of  nondiabetic  single-cell  transcriptomes  identifies  cell-type-specific  clusters  and  defines  the  signature  genes  of  each  islet  cell 
type.  (A)  Unsupervised  analysis  of  single-cell  transcriptomes  using  t-distributed  stochastic  neighbor  embedding  (t-SNE)  demonstrates  grouping  of  single 
islet  cell  transcriptomes  into  the  major  constituent  cell  types.  Respective  cell  labels  and  coloring  were  added  after  unsupervised  analyses.  (B)  Unsupervised 
hierarchical  clustering  illustrates  relationships  of  transcriptome  profiles  between  respective  endocrine  and  exocrine  cells.  (C)  Supervised  differential  expres¬ 
sion  analysis  of  cell  types  determines  cell-specific  (signature)  genes  across  all  cells  (see  Methods).  Values  represent  log2(CPM)  expression  after  mean-cen¬ 
tering  and  scaling  between  -1  and  1 .  Violin  plots  of  selected  signature  gene  expression  are  displayed  to  the  right  of  the  heatmap.  ( D,E )  Bar  plots  depicting 
the  numbers  of  previously  reported  beta-specific  (D)  and  alpha-specific  (£)  genes  (Dorrell  etal.  201 1  b;  Bramswig  et  al.  201  3;  Nica  etal.  201  3;  Blodgett  et  al. 
2015)  confirmed  to  be  expressed  in  each  islet  cell  type  after  ANOVA  and  Tukey's  honest  significant  difference  (THSD)  post-hoc  analysis  (Methods).  (F) 
Several  beta-specific  genes  demonstrate  similar  expression  levels  in  delta  cells,  and  alpha-specific  genes  demonstrate  similar  expression  in  PP/gamma  cells. 
Values  represent  average  log2(CPM)  expression  after  mean-centering  and  scaling  between  -1  and  1 .  (p)  Beta;  (a)  alpha;  (5)  delta;  (y)  PP/gamma  cells. 


functional  in  both  human  and  mouse  delta  cells,  reducing  GSIS  in 
human  and  mouse  beta  cells  when  induced  (DiGruccio  et  al. 
2016).  LEPR,  DRD2,  and  ERBB4  expression  is  specific  to  human 
delta  cells.  In  situ  analyses  (ViewRNA,  Affymetrix)  detected  coex¬ 
pression  of  LEPR  in  79/102  (77%)  of  SST-expressing  cells  (Fig. 
4D,  arrowheads)  in  ND  islets,  confirming  the  delta  cell-specific  ex¬ 
pression  detected  in  Fluidigm  Cl  profiling.  Thus,  our  data  suggest 


intriguing  roles  for  islet  delta  cells  in  the  integration  of  metabolic 
signals  via  Ieptin,  ghrelin,  and  dopamine  signaling  pathways. 

PP/gamma,  along  with  epsilon  cells,  are  among  the  least  stud¬ 
ied  islet  cell  types  due  to  their  scarcity  in  islets.  Recent  studies  show 
that  PP/gamma  cells  are  crucial  regulators  of  energy  homeostasis 
(Yulyaningsih  et  al.  2014;  Khandekar  et  al.  2015).  In  response  to 
food  intake,  these  cells  secrete  the  anorexigenic  hormone  PPY  to 
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Figure  4.  Cell-type-specific  expression  of  metabolic,  signaling,  and  diabetes  trait  genes.  ( A )  Beta  cell-specific  expression  of  different  isoforms  of  glyco¬ 
lytic  and  metabolic  intermediate  shuttles.  Genes  marked  with  an  asterisk  represent  beta  cell  signature  genes.  (B)  Delta  cell-specific  expression  of  neuroac- 
tive-ligand  receptors  and  transcription  factors.  (I)  Bulk  intact  islets;  (p)  beta;  (a)  alpha;  (S)  delta;  (y)  PP/gamma;  (A)  acinar;  (D)  ductal;  (S)  stellate  cells.  (C) 
Monogenic  diabetes-associated  genes  and  their  cell-type-specific  expression  in  islets.  Violin  plots  show  the  log2(CPM)  expression  of  each  gene  across  cell 
types.  (CHI)  congenital  hyperinsulinism;  (MODY)  maturity  onset  diabetes  of  the  young;  (TNDM)  transient  neonatal  diabetes  mellitus;  (PNDM)  permanent 
neonatal  diabetes  mellitus.  (D)  RNA  in  situ  hybridization  (ViewRNA,  Affymetrix)  of  OCT-embedded  islet  sections  from  donor  P3  labeling  SST  (red),  LEPR 
(green),  and  nuclei  (DAPI;  blue).  White  arrowheads  indicate  SST*/ LEPR*  cells.  ViewRNA  of  OCT-embedded  islet  sections  from  donor  P4  to  detect  the  fol¬ 
lowing:  (£)  INS  (red),  HADH  (green),  and  nuclei  (DAPI;  blue)  and  (F)  S57"(red),  HADH  (green),  and  nuclei  (DAPI;  blue).  White  arrowheads  highlight  examples 
of  HADH*/INS~  (£)  and  HADH*/ SST*  (F)  cells.  Hollow  arrowheads  highlight  HADH*/ INS*  (£)  and  HADH*/SST~  (F)  cells.  In  D-F,  solid  horizontal  white  lines 
indicate  scale  bars  of  20  |cm.  In  £and  F,  white  dashed  lines  highlight  a  cell  either  co-expressing  (£)  INS/HADH  or  (F)  SST/HADH.  White  squares  in  the  bottom 
left  of  £  and  bottom  right  of  F  indicate  magnified  images  of  the  cells  highlighted  in  respective  dashed  white  boxes. 


facilitate  vagal  stimulation  of  neuropeptide  Y  receptors  in  the  hy¬ 
pothalamus  and  induce  satiety  (Khandekar  et  al.  2015).  Our  data 
suggest  interesting  parallels  in  expression  between  PP/gamma  cells 
and  serotonergic  neurons,  a  group  of  neurons  that  influence  vari¬ 
ous  cognitive  and  physiological  processes  including  anxiety, 


mood,  sleep,  and  satiety.  We  report  expression  of  FEV,  a  serotoner¬ 
gic  transcription  factor  and  necessary  driver  of  neuronal  matura¬ 
tion  previously  reported  in  mouse  beta  cells  (Ohta  et  al.  2011), 
in  PP/gamma  cells  (average  log2CPM  of  2.172).  Interestingly, 
FEV  has  also  been  implicated  in  beta  cell  differentiation,  and  Fev 
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-/-  mice  exhibit  insulin  production,  insulin  secretion,  and  glu¬ 
cose  clearance  defects  (Ohta  et  al.  2011).  Other  related  signature 
genes  in  PP/gamma  cells  include  TPH1,  encoding  a  tryptophan  hy¬ 
droxylase  essential  for  the  initial  catalysis  of  serotonin,  and 
SLC6A4,  a  serotonin  reuptake  transporter.  Serotonin  colocalizes 
with  insulin  in  beta  cells  and  promotes  GSIS  (Paulmann  et  al. 
2009).  Mice  lacking  TPH1  are  diabetic  and  exhibit  impaired  insulin 
secretion  due  to  a  lack  of  pancreatic  serotonin  (Paulmann  et  al. 
2009).  Elevated  FEV,  TPH1,  and  SLC6A4  expression  suggests  PP / 
gamma  cells  share  a  suite  of  characteristic  genes  with  serotonergic 
neurons  that,  in  the  pancreas,  integrate  central  and  peripheral 
hunger  and  satiety  cues.  We  also  observed  high  PP/gamma  expres¬ 
sion  of  muscarinic  acetylcholine  receptor  M3,  CHRM3,  which 
stimulates  exocrine  pancreatic  amylase  (Gautam  et  al.  2005),  insu¬ 
lin  secretion  (Kong  and  Tobin  2011;  Molina  et  al.  2014),  and 
smooth  muscle  contraction  and  gastric  emptying  (Eglen  et  al. 
1994).  These  data  implicate  the  less  abundant  delta  and  PP/gamma 
cell  types  as  critical  for  islet  function  via  the  integration  of  systemic 
cues  and  warrant  further  studies  to  elucidate  the  function  and 
health  of  these  cells  in  normal  and  diabetogenic  conditions. 

Single-cell  transcriptomes  link  rare  and  common  diabetes  genetic 
risk  genes  to  islet  cell  types 

We  next  sought  to  understand  the  cell  type(s)  involved  in  rare 
forms  of  diabetes,  including  transient/permanent  neonatal  diabe¬ 
tes  (T/PNDM),  CHI  and  MODY,  as  well  as  more  common  forms  of 
islet  dysfunction  and  diabetes  (T1D/T2D).  Monogenic  diabetic  dis¬ 
orders,  including  CHI,  MODY,  and  neonatal  diabetes,  are  charac¬ 
terized  by  mutations  in  a  single  gene,  often  resulting  in  beta  cell 
dysfunction  and  death  (Schwitzgebel  2014).  Five  monogenic  dia¬ 
betes  risk  genes  (Supplemental  Table  Sll;  Hoffmann  and 
Spengler  2012;  Senniappan  et  al.  2013;  Schwitzgebel  2014),  were 
enriched  in  beta  cells  (i.e.,  greater  than  fourfold  change  in  expres¬ 
sion  in  specific  islet  cell  type  relative  to  other  endocrine  cells),  in¬ 
cluding  glucose  transporter  SLC2A2  (data  not  shown),  beta  cell 
maturation  transcription  factor  PDX1,  and  the  sulfonylurea  drug 
target  ABCC8  (Fig.  4C).  PDX1  expression  has  been  reported  in  hu¬ 
man  (Li  et  al.  2016)  and  mouse  (DiGruccio  et  al.  2016)  beta  and 
delta  cells.  Despite  the  modest  number  of  delta  cells  sampled, 
our  data  also  suggest  moderate  PDX1  expression  in  delta  cells 
(four  of  16  delta  cells  with  expression  >16  CPM).  Robust  expres¬ 
sion  of  HADH  in  both  beta  and  delta  cells  (Fig.  4C)  was  confirmed 
by  in  situ  (View  RNA)  analyses  (Fig.  4E,F).  Approximately  386/457 
cells  (84%)  in  HADH  and  INS  labeled  sections  coexpressed  both 
markers  (shown  in  Fig.  4E).  Adjacent  SST/HADH colabeling  yielded 
an  approximately  equal  proportion  (n  =  255/306;  83%)  of  SST-neg- 
ative/HADiY-positive  cells.  Finally,  43/457  (9%)  cells  were  INS  neg- 
ative/HADH  positive,  and  41/306  (13%)  cells  coexpressed  SST  and 
HADH  (shown  in  Fig.  4F)  in  the  respective  serial  sections.  Another 
CHI-associated  gene,  UCP2  (Gonzalez-Barroso  et  al.  2008; 
Senniappan  et  al.  2013),  which  was  reported  to  be  highly  ex¬ 
pressed  in  human  beta  cells  (Liu  et  al.  2013)  and  to  suppress  insulin 
secretion  (Krauss  et  al.  2003),  was  enriched  in  delta  cells  (Fig.  4C). 
Delta  cell  expression  of  monogenic  diabetes  genes  thus  implicate 
this  cell  type  in  the  molecular  genetics  of  rare  islet  dysfunction 
and  diabetes  disorders,  particularly  CHI. 

We  also  investigated  cell  type  expression  patterns  of  536  islet 
expression  quantitative  trait  loci  (eQTL)  target  genes  (Lyssenko 
et  al.  2009;  Dupuis  et  al.  2010;  Dayeh  et  al.  2013;  Fadista  et  al. 
2014;  Kulzer  et  al.  2014;  van  de  Bunt  et  al.  2015).  The  majority 
of  these  genes  (w  =  309;  Supplemental  Table  Sll)  were  lowly  ex¬ 


pressed  in  both  the  endocrine  islet  single-cell  transcriptomes  and 
in  the  paired  bulk  islet  transcriptomes  (Supplemental  Fig.  S16A). 
One  hundred  fifty-nine  additional  genes  did  not  exhibit  a  greater 
than  or  equal  to  fourfold  expression  change  in  any  endocrine  islet 
cell  type.  Of  the  remaining  68  eQTL  genes,  54,  46,  51,  and  43  were 
expressed  in  beta,  alpha,  delta,  and  PP/gamma  cells,  respectively. 
Surprisingly,  beta  and  delta  cells  possessed  the  highest  numbers 
of  cell-type-specific  eQTL  genes  (Supplemental  Table  Sll). 

Genome-wide  association  studies  (GWAS)  have  identified 
more  than  100  loci  associated  with  T2D  and  related  quantitative 
traits  (Mohlke  and  Boehnke  2015).  Because  GWAS  identify  genetic 
variants  associated  with  a  disease,  but  not  the  specific  gene(s)  af¬ 
fected  (Pearson  and  Manolio  2008;  Manolio  2010),  we  took  two 
approaches  to  assess  cell-type  expression  of  patterns  of  putative 
GWAS  genes.  First,  we  compiled  and  examined  a  list  of  197  report¬ 
ed  putative  T1D  and  T2D  GWAS  genes  (Bakay  et  al.  2013;  Nica 
et  al.  2013;  Fadista  et  al.  2014;  Marroqui  et  al.  2015;  Mohlke  and 
Boehnke  2015).  Of  these  genes,  37  were  expressed  in  beta,  24  in 
alpha,  28  in  delta,  and  22  in  PP/gamma  cells  (Supplemental 
Table  Sll).  Similarly,  genes  that  were  cell-type  specific  were  ex¬ 
pressed  at  higher  levels  in  ND  bulk  intact  islets  compared  with 
those  genes  without  cell-type  specificity  (Supplemental  Fig. 
S16B).  Ten  genes  were  uniquely  expressed  In  beta  cells,  including 
MEG3,  a  type  1  diabetes  (TlD)-associated  lincRNA  with  reported 
expression  in  mouse  beta  cells  and  potential  tumor  suppressor  ac¬ 
tivity  (Modali  et  al.  2015),  and  IAPP,  whose  protein  product,  when 
aggregated,  possesses  cytotoxic  properties  that  may  contribute  to 
beta  cell  death  and  dysfunction  in  T2D  (Westermark  et  al.  2011). 
We  also  identified  five  putative  T2D  GWAS  genes  (including 
HHEX)  to  be  uniquely  expressed  in  delta  cells.  To  conduct  a 
more  liberal  analysis  of  putative  GWAS  genes,  we  identified  all  sin¬ 
gle-nucleotide  polymorphisms  (SNPs)  associated  with  polygenic 
diabetes  and  related  traits  from  the  GWAS  catalog  (https://www. 
ebi.ac.uk/gwas/).  For  each  reported  SNP  associated  with  T2D, 
T1D,  fasting  insulin,  fasting  glucose,  and  proinsulin,  we  examined 
the  expression  of  all  genes  overlapping  within  one  megabase  of 
the  chromosomal  locus  and  identified  263  genes  with  cell-type- 
specific  expression  (Supplemental  Table  S12).  Together,  our  obser¬ 
vations  of  cell-type-specific  expression  of  eQTL  and  monogenic 
and  common  (T2D  GWAS)  diabetes  genes  both  confirm  beta 
cell-specific  expression  of  multiple  diabetes-associated  genes 
(MEG3,  DLK1,  SLC2A2,  etc.)  and  implicate  other  cell  types  in  the 
molecular  genetic  pathogenesis  of  diabetes.  In  light  of  recent  stud¬ 
ies  (Zhang  et  al.  2014;  DiGruccio  et  al.  2016)  and  our  data,  which 
suggest  that  delta  cells  may  be  critical  regulators  of  glucose  homeo¬ 
stasis  and  islet  function,  this  provides  a  new  avenue  for  investiga¬ 
tion  of  T2D  pathogenesis,  as  well  as  potentially  new  therapeutic 
targets  and  treatment  options. 

Comparison  of  T2D  and  ND  single-cell  transcriptomes  uncovers 
cell-type-specific  differences  not  detected  in  whole  islets 

Finally,  we  compared  single-cell  transcriptome  profiles  from  T2D 
and  ND  donors  to  identify  differentially  regulated  genes  and  ob¬ 
tain  greater  insight  into  the  molecular  genetic  pathogenesis  of  di¬ 
abetes.  After  unsupervised  hierarchical  clustering  (Fig.  5A)  and  t- 
SNE  analysis  (Supplemental  Figs.  S17,  S18)  using  2754  of  the 
most  highly  expressed  genes  (Supplemental  Table  S4),  we  observed 
that  transcriptomes  clustered  by  cell  type  regardless  of  disease 
state.  As  previously  observed,  clustering  was  not  driven  by  marker 
gene  expression  (Supplemental  Figs.  S19,  S20).  For  regions  of  the 
dendrogram  (Fig.  5A)  where  samples  appeared  to  cluster  by  disease 
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Figure  5.  Single-cell  transcriptome  analyses  identify  cell-type-specific  expression  changes  in  T2D  islets.  (A)T2D  and  ND  single-cell  transcriptomes  cluster 
together  by  cell  type  after  unsupervised  hierarchical  clustering.  (6)  Number  of  each  ND  and  T2D  cell  type  classified  by  marker  gene  expression  as  shown  in 
Figure  2.  The  numbers  of  cells  expected  in  each  condition  based  on  a  y2  test  are  indicated  in  parentheses.  (C-E,  top)  Scatter  plots  of  log2  fold-change  (FC) 
expression  detected  between  T2D  and  ND  samples  from  bulk  intact  RNA-seq  (y-axis)  and  from  Fluidigm  Cl  single-cell  RNA-seq  (x-axis)  from  beta  cells  (left 
plot;  red),  alpha  cells  (middle  plot;  blue),  and  delta  cells  (right  plot;  green).  ( Bottom )  Violin  plots  highlight  examples  of  differentially  expressed  genes  in  one 
single-cell  type.  Dashed  purple  lines  represent  repressed  genes  in  the  respective  T2D  cell  type,  while  dashed  blue  lines  represent  induced  genes.  (*)  FDR  < 
0.05,  (**)  FDR  <  0.01,  (***)  FDR  <  0.001 .  (F)  Venn  diagram  showing  the  intersections  of  differentially  expressed  genes  identified  between  T2D  and  ND  tran¬ 
scriptomes  at  single-cell-type  and  islet  single-cell  ensemble  resolution.  The  islet  single-cell  ensemble  represents  the  pooled  collection  of  beta,  alpha,  delta, 
and  PP/gamma  single  cells. 


state,  we  found  that  islet  donor  identity  was  an  underlying  factor 
that  reflected  sample  subclustering  (Supplemental  Fig.  S21).  We 
obtained  fewer  beta  cells  among  the  T2D  islet  cells  sampled  com¬ 
pared  with  ND  samples  (Fig.  5B).  However,  observed  differences 
in  T2D  and  ND  single-cell  proportions  did  not  differ  significantly 
from  expected  cell-type  proportions  (Fig.  5B,  %2  P-value  =  0.2733), 
and  none  of  the  islets  from  these  newly  diagnosed  T2D  individuals 
exhibited  as  significant  a  decrease  as  previously  reported  (Butler 


et  al.  2003;  Cnop  et  al.  2005;  Donath  et  al.  2005;  Prentki  and 
Nolan  2006). 

Recent  studies  have  reported  features  of  beta  cell  de-differenti- 
ation  under  diabetogenic  and  stress  conditions  (Talchai  et  al.  2012; 
Wang  et  al.  2014;  Cinti  et  al.  2016).  However,  we  did  not  identify 
significant  shifts  in  islet  cell  populations,  increases  in  number  of 
hormone-negative  "none"  cells,  or  appearances  of  new  or  more 
abundant  populations  of  cells  in  T2D  islets  that  clustered  distinctly 
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from  the  known  islet  cell  types  in  this  study.  Moreover,  expression 
of  reported  de-differentiation  genes  including  FOXOl,  NANOG, 
and  POU5F1  (Talchai  et  al.  2012)  did  not  differ  significantly  be¬ 
tween  T2D  and  ND  islet  cell  types  nor  the  paired  bulk  intact  islet 
preparations  (Supplemental  Fig.  S22).  Finally,  other  de-differentia- 
tion  markers  such  as  NEUROG3  and  MYCL  were  not  detected  in  our 
single-cell  or  bulk  intact  islet  data.  Thus,  our  analysis  did  not  iden¬ 
tify  transcriptional  evidence  of  de-differentiated  cells  in  T2D  islets. 

Comparison  of  islet  cell-type  transcriptomes  (e.g.,  T2D  beta 
vs.  ND  beta)  did,  however,  identify  410  genes  that  were  differen¬ 
tially  expressed  (FDR<5%)  between  T2D  and  ND  donors 
(Supplemental  Table  S6)  beta,  (Fig.  5C,  n  =  248),  alpha  (Fig.  5D, 
»  =  138),  and  delta  cells  (Fig.  5E,  n  =  24).  We  also  identified  dif¬ 
ferentially  expressed  genes  in  acinar  (n  =  74),  ductal  (n  =  35),  and 
stellate  (n  =  28)  exocrine  cell  types  (Supplemental  Fig.  S23; 
Supplemental  Table  S6).  T2D  beta  cells  exhibited  a  1.4-fold  de¬ 
creased  INS  expression  compared  with  ND  beta  cells  (Fig.  5C). 
STX1A  was  significantly  reduced  (log2FC  -1.5178)  in  T2D  beta 
cells,  consistent  with  reported  decreases  in  STX1A  protein  levels 
in  T2D  beta  cells  (Andersson  et  al.  2012).  STX1A  combines  with 
SNAP-25  and  VAMP2  to  form  a  tertiary  SNARE  protein  complex 
important  for  insulin  secretion  in  beta  cells  (Andersson  et  al. 
2012),  and  STX1A  inhibition  drastically  reduces  GSIS  and  exocyto- 
sis  (Vikman  et  al.  2006).  Additionally,  we  detected  elevated  DLK1 
expression  in  T2D  beta  cells  (log2FC  2.010),  which  has  been  impli¬ 
cated  in  T1D/T2D  GWAS  (Wallace  et  al.  2010)  and  is  part  of  a  dys- 
regulated  locus  in  T2D  islets  (Kameswaran  et  al.  2014).  Dlkl 
mice  exhibit  increased  glucose  sensitivity  and  insulin  secretion 
(Abdallah  et  al.  2015),  and  high  levels  of  serum  DLK1  have  been 
associated  with  insulin  resistance  in  both  rodents  and  humans 
(Chacon  et  al.  2008).  Immunofluorescence  indicates  that  DLK1 
is  beta  cell  specific  in  human  but  not  mouse  islets  (Li  et  al. 
2016),  and  FACS-enriched  mouse  beta  cells  show  low  expression 
of  Dlkl  in  comparison  to  other  sorted  islet  alpha  and  delta  cells 
(DiGruccio  et  al.  2016),  potentially  implicating  a  unique  role  of 
this  gene  in  human  T2D  progression.  These  findings  suggest  that 
perturbations  in  STX1A  and  DLK1  expression  may  contribute  to 
the  beta  cell  dysfunction  and  impaired  insulin  secretion  that  is 
commonly  observed  in  T2D  pathogenesis. 

Decreased  beta  cell  function  and  mass  are  hallmarks  of  T2D 
pathophysiology  (Cerf  2013;  Halban  et  al.  2014).  Our  analyses  sug¬ 
gest  that  transcriptional  changes  in  nonbeta  cells  may  also  contrib¬ 
ute  to  T2D  pathogenesis.  Specifically,  we  highlight  increased 
expression  of  fatty  acid  translocase  gene  CD36  (log2FC  2.296),  as 
well  as  decreased  expression  of  the  guanine  deaminase  gene, 
GDA  (log2FC  -1.062),  in  T2D  alpha  cells.  Soluble  CD36  is  a  bio¬ 
marker  of  T2D  (Alkhatatbeh  et  al.  2013)  and  diabetic  nephropathy 
(Shiju  et  al.  2015)  and  coordinates  activation  of  the  NLRP3  inflam- 
masome,  leading  to  proinflammatory  cytokine  release  and  re¬ 
duced  insulin  secretion  (Sheedy  et  al.  2013).  Within  T2D  delta 
cell  transcriptomes,  we  note  increased  LAPTM4B  expression 
(log2FC  2.871)  and  drastically  reduced  RCOR1  expression  (log2FC 
-10.128).  The  underlying  biological  significance  of  these  differen¬ 
tially  regulated  genes  remains  unclear  and  thus  requires  further  in¬ 
vestigation  of  their  roles  in  nonbeta  cell  types  and  T2D  pathology. 
We  also  compared  the  transcriptional  differences  between 
T2D  and  ND  endocrine  cells  without  first  segregating  them  into 
islet  cell  types  (334  ND  and  212  T2D  single-cell  profiles). 
Approximately  66%  of  beta  cell-specific  (n  =  165/248),  50%  of  al¬ 
pha  cell-specific  (n  =  67/138),  and  >90%  of  delta-specific  (n  =  23/ 
24)  changes  in  gene  expression  were  missed  when  cell  types 
were  not  defined  and  specifically  compared  (Fig.  5F).  The  de¬ 


creased  heterogeneity  in  the  transcriptional  profiles  of  cell-type- 
specific  comparisons  provides  increased  power  to  detect  the  tran- 
scriptomic  differences  and  argues  the  importance  of  single-cell 
analysis  in  understanding  the  molecular  basis  of  T2D. 

Recent  islet  single-cell  studies  emerged  while  this  study  was 
under  review.  We  therefore  sought  to  validate  our  observed  cell- 
type-specific  differences  in  T2D  islets  using  these  independent 
data  sets  (Wang  et  al.  2016;  Segerstolpe  et  al.  2016).  We  found 
that  54/77  genes  and  32/171  were  also  significantly  up-  and 
down-regulated,  respectively,  in  T2D  beta  cells  in  these  studies 
(P<  0.05,  two-sided  Wilcoxon  rank-sum  test)  (Supplemental  Fig. 
S24A,B;  Supplemental  Table  S13).  Notably,  DLK1  consistently  ex¬ 
hibited  approximately  fourfold  induction  in  T2D  beta  cells  in  each 
study  (Supplemental  Fig.  S24C,D)  Similarly,  39/60  and  14/78 
genes  were  significantly  Induced  or  repressed,  respectively,  in 
T2D  alpha  cells  (Supplemental  Fig.  S24E,F).  This  included  approx¬ 
imately  twofold  CD36  induction  in  each  study  (Supplemental  Fig. 
S24G,H).  Validation  rates  for  delta  cells  was  notably  lower,  likely 
due  to  the  relatively  few  cells  profiled  for  comparison.  However, 
we  did  note  a  significant  increase  (log2FC  1.203)  in  LAPTM4B  in 
T2D  delta  cells  from  Segerstolpe  et  al.  (2016),  consistent  with 
our  data. 

Discussion 

In  this  study,  we  completed  transcriptome  profiling  and  analysis  of 
638  single  islet  cells  from  ND  and  T2D  individuals.  Single-cell 
RNA-seq  protocols  are  often  limited  by  their  capture  efficiency 
due  to  the  fact  that  a  limited  proportion  of  each  cell's  total  tran¬ 
scripts  is  represented  in  the  final  sequencing  library  (Liu  and 
Trapnell  2016).  Additionally,  these  approaches  have  difficulty  de¬ 
tecting  expression  and  changes  in  expression  of  low  abundance 
transcripts.  Despite  these  limitations,  we  observed  a  strong  correla¬ 
tion  between  the  transcriptomes  of  paired  bulk  islets  and  single 
cells,  indicating  these  are  high-quality  and  representative  data 
sets.  Based  on  single-cell  transcriptome  profiles,  we  have  identified 
cells  of  each  endocrine  (alpha,  beta,  delta,  PP/gamma,  epsilon)  and 
exocrine  (stellate,  ductal,  acinar)  type  in  the  pancreas  in  an  agnos¬ 
tic  and  data-driven  manner. 

This  approach  has  defined  expression  signatures  of  each  cell 
type  with  single-cell  precision.  Cell-type-specific  expression  pat¬ 
terns  in  our  data  such  as  MAFA  in  beta  cells  and  IRX2  in  alpha  cells 
are  concordant  with  and  extend  those  generated  on  a  smaller  set  of 
cells  and  an  independent  platform  (Li  et  al.  2016).  Notably,  our  ap¬ 
proach  also  uncovered  important  instances  of  shared  expression 
between  these  cell  types  and  the  less  common  delta  and  PP/gam¬ 
ma  islet  populations,  including  genes  mutated  in  CHI  (HADH) 
and  transcription  factors  regulating  cell  fate/identity  (ARX). 
HADH  encodes  hydroxyacyl-CoA  dehydrogenase,  an  important 
enzyme  and  negative  regulator  of  glutamate  dehydrogenase 
(GDH)  and  insulin  secretion.  Expression  of  HADH  in  islets  has 
been  shown  to  be  beta  cell  specific  (Kapoor  et  al.  2010;  Pepin 
et  al.  2010),  and  indeed,  knockdown  of  HADH  in  rat  832/13  beta 
cells  increases  insulin  secretion  (Pepin  et  al.  2010).  Surprisingly, 
our  combined  transcriptomic  analyses  and  in  situ  (ViewRNA)  val¬ 
idation  of  HADH  revealed  shared  expression  in  beta  and  delta  cells. 
These  findings  suggest  delta  cell  dysfunction,  in  addition  to  beta 
cell  dysfunction,  as  potential  contributing  factors  to  the  develop¬ 
ment  of  monogenic  diabetic  disorders. 

Most  importantly,  analysis  of  the  delta  and  PP/gamma  islet 
cell  transcriptomes  revealed  cell-type-specific  expression  of  multi¬ 
ple  genes  that  suggest  important  roles  for  these  cells  in  islet 
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physiology  and  the  molecular  genetics  of  islet  dysfunction  in  rare 
(e.g.,  PNDM,  TNDM,  and  MODY)  and  common  (e.g.,  T2D)  forms 
of  diabetes.  The  novel  transcriptome  signatures  uncovered  for  hu¬ 
man  delta  and  PP/gamma  cells  includes  genes  that  strongly  suggest 
important  roles  for  each  cell  type  in  sensing  and  integrating  specif¬ 
ic  systemic  cues  to  govern  islet  (dys)function.  This  clearly  warrants 
additional  work  to  better  understand  the  regulation  and  function 
of  these  cells  in  normal  and  diabetic  states.  New  cell  surface  mark¬ 
ers  identified  for  each  of  these  cell  types  could  be  used  to  specifi¬ 
cally  enrich  and  purify  these  populations  for  detailed  functional 
analysis. 

Finally,  by  comparing  single-cell  transcriptomes  from  T2D 
and  ND  islets,  we  were  able  to  study  quantitative  changes  in 
cell  populations  and  cell-type-specific  expression  in  T2D  patho¬ 
genesis.  Although  not  reaching  statistical  significance,  we  did  ob¬ 
serve  a  trend  of  decreased  beta  cells  in  T2D  islets  versus  ND  islets. 
This  difference  was  not  as  pronounced  as  in  previous  reports,  pos¬ 
sibly  due  to  the  relatively  modest  number  of  cells  sampled  per  in¬ 
dividual.  Alternatively,  as  most  of  the  T2D  islet  single-cell 
transcriptomes  came  from  newly  diagnosed  individuals,  this  dif¬ 
ference  may  also  reflect  the  shorter  duration  or  decreased  severity 
of  T2D  in  these  samples  compared  with  other  studies.  Previous 
studies  suggested  that  beta  cell  de-differentiation  may  underlie 
beta  cell  loss  in  T2D  (Talchai  et  al.  2012;  Wang  et  al.  2014; 
Cinti  et  al.  2016).  However,  a  subsequent  study  comparing  hu¬ 
man  islets  from  14  T2D  and  13  ND  individuals  did  not  identify 
clear  evidence  of  this  phenomenon  (Butler  et  al.  2016). 
Similarly,  our  data  do  not  provide  transcriptome-based  evidence 
of  fraus-differentiation  or  de-differentiation  phenomena  in  T2D 
islets.  We  observed  neither  the  appearance  of  new  or  distinct  sub¬ 
populations  among  the  T2D  islet  single  cells  nor  significant 
changes  of  reported  de-differentiation  genes  between  T2D  and 
ND  cell  types  (e.g.,  T2D  beta  cells  vs.  ND  beta  cells),  although  it 
is  possible  that  de-differentiated  cells  were  simply  not  captured 
in  our  study.  Overall,  we  identify  248,  138,  and  24  genes  exhib¬ 
iting  differential  expression  in  T2D  versus  ND  beta,  alpha,  and 
delta  cells,  respectively.  Consistent  with  Simpson's  paradox,  ap¬ 
proximately  half  of  these  genes  in  each  major  islet  cell  type 
(64%  beta,  45%  alpha)  and  ~90%  of  these  in  the  less  abundant 
delta  cells  were  not  detected  in  whole  islet  or  single-cell  islet  tran¬ 
scriptomes  when  they  were  not  stratified  by  cell  type  (Simpson 
1951;  Trapnell  2015).  Each  of  these  differentially  regulated  genes 
may  represent  important  new  candidate  genes  in  T2D  pathogen¬ 
esis  and  therapeutic  targeting. 

Methods 

Islet  acquisition,  processing,  and  dissociation 

Islets  were  procured  from  ProdoLabs  or  the  Integrated  Islet 
Distribution  Program  (IIDP)  and  shipped  in  PIM(T)  media 
(ProdoLabs)  overnight  on  cold  packs.  Upon  arrival,  islets  were 
washed  and  transferred  into  PIM(S)  media  with  PIM(G)  and  PIM 
(ABS)  supplements  according  to  the  manufacturer's  instructions 
(ProdoLabs)  and  incubated  at  37°C  in  a  5%  C02  tissue  culture  in¬ 
cubator.  Twenty-four  hours  after  transfer,  approximately  500  islet 
equivalents  (IEq)  were  aliquoted  and  centrifuged  at  180^  for  3  min 
at  room  temperature  (RT).  One  aliquot  (100  IEq)  was  immediately 
flash  frozen  (Fig.  1A,  baseline),  one  (200  IEq)  was  resuspended  in  1 
mL  Prodo-media  (Fig.  1A,  intact),  and  one  (200  IEq)  was  resus¬ 
pended  in  1  mL  Accutase  (Innovative  Cell  Technologies)  (Fig. 
1A,  dissociated  and  single  cell)  and  incubated  for  10  min  in  a  37° 
C  water  bath,  with  pipetting  every  2  min.  Accutase-dissociated 


cells  were  filtered  through  a  prewet  cell  strainer  (BD)  to  collect  sin¬ 
gle  cells,  rinsed  with  9  mL  prewarmed  CMRL  +  10%FBS  media  to 
stop  the  reaction,  and  centrifuged  at  180^  for  3  min  at  RT. 
Dissociated  cells  were  resuspended  in  300  pL  CMRL  +  10%FBS  me¬ 
dia.  Cell  size,  number,  and  viability  were  assessed  using  Countess  II 
FL  (Thermo  Fisher  Scientific),  and  the  cell  suspension  was  diluted 
to  a  final  concentration  of  300  cells/pL.  Total  processing  and  han¬ 
dling  time  for  each  islet  was  <60  min. 

Single-cell  processing  on  the  Cl  single-cell  Autoprep  system 

After  counting,  cells  were  diluted  to  a  final  concentration  range  of 
250-400  cells/gL  and  5  pL  loaded  onto  each  Cl  integrated  fluidic 
circuit  (IFC;  10-  to  17-pm  chip)  for  cell  capture  on  the  Cl  single¬ 
cell  Autoprep  system.  For  each  islet  preparation,  up  to  two  micro¬ 
fluidic  chips  were  used.  After  capture,  cells  were  imaged  within 
each  capture  nest  with  an  EVOS  FL  auto  microscope  (Life 
Technologies).  IFCs  were  subsequently  loaded  with  additional  re¬ 
agents  for  subsequent  cell  lysis;  SMARTer  vl-  based  (Clontech), 
olio-(dT)-primed  reverse  transcription;  template  switching  for  sec¬ 
ond-strand  priming;  and  amplification  of  cDNA  on  the  Cl  System. 
Qualitative  and  quantitative  analysis  of  all  single-cell  cDNA  prod¬ 
ucts  was  performed  on  a  96  capillary  fragment  analyzer  (Advanced 
Analytical).  Only  cell  singlets,  as  determined  by  imaging,  with  ad¬ 
equate  cDNA  yield  and  quality  were  processed  for  subsequent  se¬ 
quencing.  Fragmentation  and  tagmentation  of  cDNA  was  done 
with  Nextera  XT  reagent  (Illumina)  using  dual  indices  to  prepare 
single-cell  multiplexed  libraries. 

Bulk-cell  RNA-seq 

Bulk  cells  were  pelleted  and  RNA  purified  using  the  PicoPure  RNA 
isolation  kit  (Life  Technologies).  All  RNA-seq  libraries  from  bulk- 
sample  RNA  were  generated  with  the  same  SMARTer  vl  chemistry 
(Clontech)  as  for  the  Cl  single-cell  data  largely  following  the  man¬ 
ufacturer's  instructions.  Unlike  the  Cl  workflow,  after  first-strand 
DNA  synthesis,  cDNA  was  purified  using  Agencourt  AMPure  beads 
(Beckman  Coulter).  cDNA  was  subsequently  amplified  through  12 
PCR  cycles.  The  cDNA  yield  and  fragment  size  were  measured  on  a 
2100  Bioanalyzer  (Agilent).  For  sequencing  library  preparation, 
amplified  cDNA  was  sheared  using  a  Covaris  LE220  system  to  ob¬ 
tain  fragments  of  ~200  bp.  The  fragmented  cDNA  was  prepared 
for  sequencing  using  the  NEBNext  DNA  library  prep  kit  for 
Illumina  sequencing  (New  England  Biolabs). 

Sequencing,  read  mapping,  and  quality  control 

All  sequencing  was  performed  on  a  NextSeq500  (Illumina)  using 
the  75-cycle  high-output  chip.  RNA-seq  reads  were  subjected  to 
quality  control  using  custom  scripts  developed  at  the  computa¬ 
tional  sciences  group  at  The  Jackson  Laboratory.  Briefly,  reads 
with  >30%  of  bases  with  quality  scores  less  than  30  were  removed 
from  the  analysis,  and  samples  with  >50%  of  the  low-quality  reads 
were  removed  from  the  cohort.  Trimmed  reads  were  mapped  to  hu¬ 
man  transcriptome  (GRCh37,  Ensembl  v70)  using  Bowtie  2 
(Langmead  and  Salzberg  2012),  and  expression  levels  of  all  genes 
were  estimated  using  RSEM  (Li  and  Dewey  2011).  Transcript  per 
million  (TPM)  values  as  defined  by  RSEM  were  added  a  value  of 
one  prior  to  log2  transformation  to  avoid  zeros.  GRCh3  7  was  select¬ 
ed  for  mapping  to  facilitate  integration  and  comparative  analyses 
with  existing  islet  data  sets  (e.g.,  Parker  et  al.  2013;  Fadista  et  al. 
2014;  van  de  Bunt  et  al.  2015)  and  ENCODE  and  NIH  Roadmap 
data  by  members  of  the  islet  biology,  diabetes,  and  functional  ge¬ 
nomics  communities.  The  observation  of  expected  "positive  con¬ 
trol"  genes  for  each  cell  type  strongly  suggested  that  mapping  to 


Genome  Research  1 1 

www.genome.org 


39 


Downloaded  from  genome.cshlp.org  on  February  20,  2017  -  Published  by  Cold  Spring  Flarbor  Laboratory  Press 


Lawlor  et  al. 


GRCh37  instead  of  GRCh38  did  not  mask  or  alter  any  important 
conclusions  that  could  be  drawn  from  the  data. 

Single-cell  sample  processing  and  quality  filtering 

We  used  26,616  protein-coding  genes  and  lincRNAs  from  the 
GRCh37,  Ensembl  v70  build  in  our  study.  Genes  with  expression 
five  or  more  TPMs  in  a  sample  were  defined  as  expressed. 
Seventy-two  single-cell  samples  that  expressed  fewer  than  3500 
genes  according  to  these  criteria  were  removed  from  downstream 
analysis. 

Islet  cell  type  classification 

GMM  of  islet  marker  genes  was  performed  on  a  per  gene  basis  us¬ 
ing  the  R-package  mclust_5.2  (Scrucca  et  al.  2016).  Each  single-cell 
sample  was  classified  as  a  specific  pancreatic  cell  type  if  and  only  if 
a  single  gene  from  the  selected  marker  gene  list — INS  (beta),  GCG 
(alpha),  SST  (delta),  PPY  (PP/gamma),  KRT19  (ductal),  PRSS1  (aci¬ 
nar),  and  COL1A1  (stellate) — was  expressed  in  the  sample  and 
none  of  the  other  marker  genes  were  expressed.  Cells  expressing 
no  marker  genes  were  labeled  as  "none,"  and  those  expressing 
>1  marker  gene  were  labeled  as  "multiple."  Fluidigm  released  a 
white  paper  report  detailing  the  potential  for  single  cells  to  "z- 
stack"  in  up  to  30%  of  capture  nests  on  the  medium  (10-17  pm) 
Fluidigm  Cl  chip  (http://info.fluidigm.com/rs/673-MRG-416/ 
images/Cl -Med-96-IFC-Redesign_wp_101-3328Bl_FINAL.pdf). 
DAPI  staining  identified  z-stacked  islet  cell  doublets  in  10%  and 
30%  of  capture  nests  from  two  additional  Cl  single-cell  captures. 
Because  the  proportion  of  "multiple"  labeled  cells  approximately 
equaled  that  of  z-stacked  doublets,  we  discarded  these  cells  (n  = 
340)  from  subsequent  analyses. 

Unsupervised  dimensionality  reduction  and  hierarchical 
clustering 

Barnes-Hut  variant  of  t-SNE  (van  der  Maaten  2014)  was  imple¬ 
mented  using  the  R-package  Rtsne_0.10  (https://github.com/ 
jkrijthe/Rtsne).  ND  single-cell  transcriptomes  were  reduced  to 
two  dimensions  in  an  unsupervised  manner  using  genes  with  log2_ 
CPM  values  greater  than  10.5  in  at  least  one  sample.  Similar  anal¬ 
yses  were  repeated  using  only  the  T2D  single-cell  data  and  the 
combined  single-cell  data.  Hierarchical  clustering  of  cell  transcrip¬ 
tomes  was  performed  using  Euclidean  distance,  Ward.D2  linkage, 
and  the  same  gene  selection  criteria.  Resultant  "fan"  dendrogram 
images  were  produced  using  the  R-packages  dendextend_1.1.8 
(Galili  2015)  and  ape_3.S  (Paradis  et  al.  2004).  Bulk  islet  transcrip¬ 
tomes  were  clustered  using  the  same  criteria. 

Supervised  differential  gene  expression  analysis 

Differential  expression  analyses  were  performed  using  the 
Bioconductor  package  edgeR_3.14.1  (Robinson  et  al.  2010). 
Gender  was  used  as  a  blocking  factor  to  account  for  variability  be¬ 
tween  male  and  female  patient  islets.  In  each  comparison,  protein¬ 
coding  genes  and  lincRNAs  with  20  or  fewer  counts  in  at  least  20% 
of  either  cell  type  population  being  compared  or  at  least  a  mini¬ 
mum  of  three  cells  were  used.  Differentially  expressed  genes  with 
FDR  <  5%  were  regarded  as  significant  results.  Endocrine  cell  signa¬ 
ture  genes  were  identified  by  first  performing  the  above  differential 
expression  analysis  procedure  between  each  endocrine  cell  type 
(e.g.,  beta  vs.  alpha,  beta  vs.  delta,  and  beta  vs.  PP/gamma). 
Afterward,  the  intersection  of  these  results  was  performed  to  iden¬ 
tify  genes  exclusively  enriched  in  the  cell  type.  Exocrine  cell  signa¬ 
ture  genes  were  identified  after  pairwise  comparisons  between 
each  respective  exocrine  cell  type  (e.g.,  acinar  vs.  stellate,  acinar 


vs.  ductal).  Comparisons  between  T2D  and  ND  single-cell  tran¬ 
scriptomes  were  performed  for  the  same  cell  types  (e.g.,  T2D  beta 
cells  vs.  ND  aeta  cells)  to  identify  cell-type-specific  differences  in 
gene  expression  between  T2D  and  ND  states. 

ANOVA  and  post-hoc  analyses 

For  each  collection  of  diabetes-associated  and  eQTL  genes  exam¬ 
ined,  one-way  analysis  of  variance  (ANOVA)  was  used  to  identify 
statistically  significant  differences  (FDR  >5%)  in  islet  cell-type 
gene  expression.  Following  this,  we  performed  a  post  hoc  analysis 
using  a  THSD  test  to  determine  genes  with  cell-type-specific  ex¬ 
pression  patterns  (fold  change  >4).  Genes  were  classified  as 
"pan-islet"  if  they  had  an  average  log2(CPM)  expression  greater 
than  four  in  all  islet  cell  types.  Genes  that  were  not  enriched  in  a 
cell  type  or  pan-islet  were  classified  as  "lowly  expressed"  (average 
log2(CPM)  <  2  in  all  cell  types),  and  the  remaining  genes  were  clas¬ 
sified  as  "less  than  fourfold  change."  This  same  methodology  was 
used  to  characterize  expression  of  the  previously  reported  alpha- 
and  beta-specific  genes  from  Dorrell  et  al.  (2011b),  Bramswig 
et  al.  (2013),  Nica  et  al.  (2013),  and  Blodgett  et  al.  (2015).  Similar 
methods  were  used  to  characterize  expression  patterns  of  genes 
nearby  diabetes-related  GWAS  SNPs  (downloaded  from  the 
GWAS  Catalog,  https://www.ebi.ac.uk/gwas/,  and  available  in 
Supplemental  Table  SI 2).  Protein-coding  RNAs  and  lincRNAs 
that  overlapped  within  one  megabase  upstream  of  and  down¬ 
stream  from  the  diabetes-associated  SNPs  were  analyzed. 

Gene  set  enrichment  analysis 

The  Bioconductor  package  gage_2.22.0,  (Luo  et  al.  2009)  was  used 
with  default  parameters  to  identify  enrichment  (FDR  <  5%)  of  hu¬ 
man  Kyoto  Encyclopedia  of  Genes  and  Genomes  (KEGG)  path¬ 
ways  in  each  of  the  ND  islet  cell  transcriptomes.  Enriched 
pathways  were  determined  by  comparing  each  cell-type  transcrip- 
tome  against  the  other  aggregate  islet  cell-type  transcriptomes 
(e.g.,  beta  vs.  alpha,  delta,  and  PP/gamma). 

RNA  in  situ  hybridization 

RNA  transcripts  were  visualized  in  OCT-embedded  pancreatic  islet 
sections  from  two  ND  donors  (P3  and  P4)  using  QuantiGene 
ViewRNA  ISH  cell  assay  kit  (catalog  no.  QVC0001,  Affymetrix). 
Human  INS  ViewRNA  type  6  probe  (Catalog  no.  VA6-13248-06), 
SST  ViewRNA  type  6  probe  (catalog  no.  VA6-17244-06),  LEPR 
ViewRNA  type  1  probe  (catalog  no.  VA1-15221-06),  and  HADH 
ViewRNA  type  1  probe  (catalog  no.  VA1-12106-06)  were  purchased 
from  Affymetrix  (Santa  Clara).  The  assay  was  performed  according 
to  the  cell-based  ViewRNA  assay  protocol  with  a  15-min  formalde¬ 
hyde  fixation  and  a  10-min  protease  treatment  (dilution  factor 
1:4000).  ViewRNA  probes  were  detected  at  550  nm  (Cy3)  and 
650  nm  (Cy5)  using  a  Leica  TSC  SP8  confocal  microscope  at  63x 
magnification. 

Islet  cell  subpopulation  analyses 

Dorrell  et  al.  201 6  previously  defined  four  beta  cell  subpopulations 
with  differing  expression  of  59  genes.  With  this  gene  set,  we  at¬ 
tempted  to  validate  the  presence  of  these  four  subpopulations 
via  unsupervised  t-SNE  and  hierarchical  clustering  of  all  ND  beta 
cell  transcriptomes  (/;  =  168).  Similarly,  Bader  et  al.  (2016)  charac¬ 
terized  proliferative  (F7fp+/FVR+)  and  mature  (Fltp~l FVR~)  mouse 
beta  cells  that  showed  differential  expression  of  996  transcripts. 
By  using  the  Mouse  Genome  Informatics  (MGI;  http://www. 
informatics.jax.org)  database,  these  996  transcripts  corresponded 
to  691  human  orthologs  that  were  detected  in  our  data  set.  Beta 
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cell  transcriptomes  were  clustered  using  these  orthologs  to  attempt 
to  identify  mature  and  proliferating  subpopulations.  Finally,  we 
used  the  functions  available  in  scran_1.04  (http: //bioconductor, 
org/packages/release/bioc/html/scran.html)  to  computationally 
assign  single-cell  samples  into  cell  cycle  phases  (Gl,  G2/M,  or  S 
phase)  and  investigate  whether  our  data  set  contained  proliferat¬ 
ing  islet  cells. 

Data  access 

Raw  sequence  data  from  this  study  have  been  submitted  to  the  da¬ 
tabases  of  NCBI  Sequence  Read  Archive  (SRA;  http://www.ncbi. 
nlm.nih.gov/sra)  under  accession  number  SRP075970  and 
BioProject  (http://www.ncbi.nlm.nih.gov/bioproject/)  under  ac¬ 
cession  number  PRJNA323853.  Processed  data  sets  from  this 
study  have  been  submitted  to  Gene  Expression  Omnibus  (GEO; 
http://www.ncbi.nlm.nih.gov/geo/)  under  accession  number 
GSE86473.  UCSC  Genome  Browser  tracks  of  aggregate  ND  islet  sin¬ 
gle-cell-type  transcriptomes  are  available  at  http://genome.ucsc. 
edu/  and  may  be  accessed  with  the  user  name  "Iawlorn"  and  ses¬ 
sion  name  "Islet_Single_Cell_Type_Transcriptomes."  The  source 
code  used  to  produce  the  figures  and  tables  in  this  paper  is  avail¬ 
able  in  the  SupplementaLMethods_Source_Code  folder  as  sug¬ 
gested  by  Hoffman  (2016). 
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CellView:  Interactive  exploration  of  high  dimensional  single  cell  RNA-seq  data 

Mohan  T.  Bolisetty1,  Michael  L.  Stitzel1,2,3  &  Paul  Robson1 2  3 

Advances  in  high-throughput  single  cell  transcriptomics  technologies  have  revolutionized  the 
study  of  complex  tissues.  It  is  now  possible  to  measure  gene  expression  across  thousands  of 
individual  cells  to  define  cell  types  and  states.  While  powerful  computational  and  statistical 
frameworks  are  emerging  to  analyze  these  complex  datasets,  a  gap  exists  between  this  data  and 
a  biologist’s  insight.  The  CellView  web  application  fills  this  gap  by  providing  easy  and  intuitive 
exploration  of  single  cell  transcriptome  data. 

Recent  technological  advances  in  single  cell  capture  and  nano-scale  reactions  have  led  to  a  major 
revolution  in  single  cell  transcriptomics1,2,3.  Single  cell  datasets  are  analyzed  using  computational  and 
statistical  frameworks  that  enable  feature  (gene)  selection,  dimensionality  reduction,  clustering  and 
differential  gene  expression.  Multiple  software  packages  exist  that  allow  researchers  well  versed  in 
computational  analysis  to  perform  this  analysis4-6.  However,  identifying  the  exact  parameters  required  for 
cell  type  identification  is  an  iterative  process  greatly  improved  when  informed  by  biology.  In  addition, 
interactive  exploration  of  single  cell  datasets  incorporating  a  biologist’s  knowledge  greatly  improves  data 
interpretation,  yet  often  such  experts  do  not  have  big  data  handling  skills. 

Advances  in  web  application  frameworks  and  visualization  methods  for  dense  datasets  facilitate  the 
development  of  interactive  applications  to  allow  easy  and  intuitive  exploration  of  single  cell  data.  Here, 
we  introduce  an  R  Shiny7  web  application,  CellView,  that  allows  knowledge-based  and  hypothesis-driven 
exploration  of  processed  single  cell  transcriptomic  data.  The  input  into  CellView  is  an  R  dataset  (.Rds)  file 
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with  three  pre-computed  data  frames  containing  expression,  clustering,  and  gene  symbol  information. 
This  file  is  agnostic  of  upstream  computational  approaches  providing  flexibility  in  algorithms  used  to 
calculate  these  data  frames.  This  .Rds  file  can  be  shared  with  the  end  user,  eliminating  the  need  for 
hosting  datasets,  thereby  decreasing  the  size  of  a  virtual  machine  or  cloud  instance  required  to  host  and 
use  CellView.  Multiple  tabs  allow  for  easy  access  to  the  data  and  visualization  of  gene  expression  across 
and  within  clusters,  aiding  cell  type  identification. 

To  illustrate  the  utility  and  power  of  CellView,  we  generated  and  analyzed  single  cell  transcriptome  data 
from  peripheral  blood  mononuclear  cells  (PBMCs)  using  the  10X  Genomics  Chromium8.  As  defined  by 
the  CellRanger8  pipeline,  this  data  consisted  of  6,554  single  cells  sequenced  to  90.1%  saturation  with,  on 
average,  824  genes  and  2,077  molecules  detected  per  cell.  Dimensionality  reduction  using  tSNE9  was 
applied  to  genes  selected  by  normalized  dispersion,  and  with  clustering  by  DBSCAN10.  CellView 
automatically  determines  cluster  numbers,  updates  the  user  interface,  and  renders  a  3D  scatter  plot 
displaying  cells  clustered  in  tSNE  space  (Fig  1b)  from  the  uploaded  .rds  file. 

The  ‘Explore’  tab  provides  cluster-centric  exploration  through  three  panel  views.  Panel  1  displays  a  3D 
plot  of  a  chosen  gene’s  expression  across  all  cells.  Panel  2  displays  a  2D  plot  of  the  same  gene’s 
expression  across  all  cells  in  a  single  cluster,  which  users  can  select  via  drop-down  list.  Within  Panel  2 
users  can  download  a  .csv  file  of  a  gene-cell  expression  matrix  by  selecting  cells  with  a  square  brush 
stroke.  This  provides  convenient  access  to  all  genes  expressed  in  a  subset  of  cells.  Panel  3  displays 
violin  plots  of  the  chosen  gene’s  cluster-specific  expression  and  includes  a  total  cell  count  for  each 
cluster.  CD79A  (Fig  1c),  a  marker  of  B-cells,  and  CD3D,  a  marker  of  T-cells  (Fig  Id),  provide 
representative  views  of  the  ‘Explore’  tab. 

The  ‘Co-expression’  tab  enables  the  generation  of  heatmaps  to  visualize  expression  of  multiple  genes 
either  across  all  clusters,  in  the  ‘AllClusters’  sub-menu,  or  on  selected  cells  within  a  cluster,  in  the 
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‘Selected  cells’  sub-menu  (Fig  2a).  The  number  of  genes  analyzed  is  only  limited  by  legibility  of  the  gene 
symbols  in  the  resulting  heatmap.  This  feature  facilitates  the  use  of  known  markers  to  empirically 
determine  cell  (sub)cluster  identity. 

The  identification  of  doublets  in  single  cell  transcriptome  data  remains  computationally  challenging.  The 
interactive  nature  of  CellView  aids  in  cell  doublet  identification.  In  the  PBMC  data,  ‘Subcluster-analysis’ 
reveals  a  mixture  of  lymphoid  and  myeloid  gene  expression  within  cluster  7  suggesting  this  cluster 
consists  of  doublets.  Cluster  7  represents  2.3%  of  all  cells  in  this  data  set,  reflecting  the  number  of 
expected  doublets  for  the  quantity  of  cells  processed  in  this  experiment.  Thus,  CellView  can  be  utilized 
as  a  tool  to  pre-process  of  single  cell  data  and  remove  doublets  prior  to  final  visualization. 

The  ‘Subcluster-analysis’  tab  also  provides  a  powerful  tool  to  identify  different  cell  types  within  clusters 
(where  trade-offs  between  sensitivity  and  specificity  in  the  chosen  clustering  algorithm  may  be  insufficient 
to  identify  unique  clusters)  or  a  continuum  of  states  within  a  cell  type.  For  example,  blood  monocytes 
span  a  continuum  of  classical,  intermediate,  and  non-classical  subtypes  in  flow  cytometry  analysis  of  cell 
surface  markers  CD14  and  CD1611.  Two  populations  of  cells  within  a  cluster  can  be  selected  by  square 
brush  strokes  for  differential  gene  expression  analyses  (using  a  likelihood  test)  to  identify  biologically 
informative  markers.  For  example,  monocytes  occupying  cluster  4  in  the  PBMC  data  appear  to  contain 
two  lobes  (Fig. 2b).  Differential  expression  between  these  two  lobes  using  the  ‘Subcluster-analysis’  tool 
identified  CD16/FCGR3A  as  the  most  differentially  expressed  gene  marking  the  smaller  lobe.  This  lobe 
also  contained  higher  expression  of  MHC  class  II  genes,  an  additional  feature  of  non-classical  blood 
monocytes.  CD14  is  among  the  top  10  up-regulated  genes  in  the  large  lobe,  which  include  other  classical 
blood  monocyte  markers  (e.g.  S100A8,  S100A9,  S100A12).  Thus,  this  blood  monocyte  continuum 
defined  by  two  cell  surface  molecules  is  detected  by  this  transcriptome  cytometry  approach  and 
represented  by  837  parameters  (i.e.  genes)  per  cell.  CellView  data  visualization  may  enable 
immunologists  to  explore  further  underlying  biology  within  the  blood  monocyte  compartment,  such  as 
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investigating  a  subset  of  cells  within  the  intermediate  sub-cluster  expressing  C1QA ,  C1QB ,  and  C7QC, 
markers  of  macrophage  in  tissue. 

Dendritic  cells  (DCs)  occupy  clusters  6  and  8  in  the  PBMC  data.  Cluster  6  represents  plasmacytoid  DCs, 
expressing  CLEC4CICD303 ,  CD68,  IL3RA/CD123  and  LILRA4/CD85g.  Myeloid  DCs  comprise  cluster  8. 
CellView’s  ‘Subcluster-analysis’  tool  enables  identification  of  both  the  common  CD1C+  DC  (Fig.  2c; 
113/215  cells  expressing  CLEC10A,  CD1C)  and  less  adundant  CD141+  DCs  (Fig.  2d;  12/215  cells 
expressing  CLEC9A,  IRF8).  An  additional  layer  of  data  we  include  in  our  .rds  files  are  the  genes  and 
unique  molecular  identifiers  (UMIs)  detected  per  cell;  this  can  enable  identification  of  cell  type  biological 
features  since  RNA  abundance  (and  therefore  UMI  count)  often  correlates  with  cell  size12.  Notably,  non- 
classical  blood  monocytes  and  myeloid  dendritic  cells  have  the  greatest  numbers  of  UMIs  detected  per 
cell,  at  3,719  and  5,645  respectively.  In  contrast,  remaining  cells  have  2,305  UMIs  per  cell.  Myeloid  DCs 
are  not  noticeably  larger  than  other  PBMCs  and  non-classical  blood  monocytes  are  somewhat  smaller  in 
size  than  classical  blood  monocytes11  suggesting  the  RNA  content  is  reflective  of  an  underlying  biological 
feature  of  these  cells  rather  than  cell  size  and  may  reflect  the  precursor  relationship  between  non- 
classical  monocytes  and  myeloid  dendritic  cells13. 

We  next  applied  CellView  to  human  pancreatic  islet  single  cell  transcriptome  data  we  generated  on  the 
Chromium  system  from  a  nondiabetic  normal  donor,  which  resulted  in  4,806  cells  sequenced  to  87.1% 
saturation  and  detecting,  on  average,  1,848  genes  and  7,686  molecules  detected  per  cell.  Our  pipeline 
identified  eight  distinct  clusters  (Fig  1e).  Using  CellView’s  ‘AllClusters’,  and  marker  genes  we  had 
previously  used  to  cell  type  in  human  islets14,  we  identified  endocrine  alpha,  beta,  delta,  and  gamma  cell 
clusters  and  exocrine  acinar,  ductal,  and  stellate  cell  clusters.  An  8th  cluster  represented  endothelial  cells 
(Fig  If).  Visual  inspection  of  the  3D  scatter  plot  displaying  cells  in  tSNE  space  indicated  two  sub-clusters 
within  the  defined  stellate  cell  cloud.  The  ‘Sub-cluster’  tool  revealed,  in  addition  to  the  stellate  cells,  a 
sub-cluster  expressing  the  pericyte  marker  RGS515.  The  close  proximity  of  stellate  cells  and  pericytes  are 
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likely  a  result  of  their  shared  mesenchymal  origin,  as  both  express  COL1A1  and  ACTA2.  Visual 
inspection  of  the  ductal  cell  cluster  identified  a  spread  of  cells  suggestive  of  a  continuum  of  cell  states. 
Differential  expression  between  cells  at  opposing  ends  of  this  continuum  using  the  ‘Sub-cluster’  tool 
identified  biologically  meaningful  differences.  While  all  cells  expressed  KRT19,  there  was  a  transition 
from  a  REG1AIAMBP-pos\t\ve  (Fig  1g)  to  a  TFF1/TFF2/TFF3/FGF19/CAECAM6-pos\t\ve  (Fig  1h) 
population.  Whether  these  represent  different  spatially  localized  populations  of  epithelial  cells  within  the 
pancreatic  duct  or  different  states  of  activation  remains  to  be  determined,  but  further  highlights  the  utility 
of  CellView  to  uncover  putative  novel  biology. 

These  examples  illustrate  how  CellView  provides  a  powerful  complement  to  current  command  line 
approaches  to  cluster  and  identify  cell  types  in  single  cell  experiments.  This  intuitive  web  application 
enables  collaboration  between  biologists  and  computational  analysts  and  increases  the  value  of  each 
single  cell  dataset.  Moreover,  the  CellView  framework  provides  a  useful  format  to  present  these  data  in 
an  interactive  manner  and  can  be  broadly  applied  to  single  cell  and  bulk  genomics  assays  with  count 
matrix  and  cluster  information.  Until  a  complete  atlas  of  cell-type  transcriptomes  has  been  defined,  where 
a  reference-based  approach  may  prove  more  powerful  for  clustering  and  cell  type  identification16, 
CellView  provides  a  useful  tool  to  explore  and  characterize  single  cell  data. 

METHODS 

Single  cell  RNA-seq  -  PBMCs  were  purchased  from  AllCells,  thawed  quickly  at  37°C  and  into  DMEM 
supplemented  with  10%  FBS.  Cells  were  quickly  spun  down  at  400g,  for  lOmin.  Cells  were  washed  once 
with  1  x  PBS  supplemented  with  0.04%  BSA  and  finally  resuspended  in  1  x  PBS  with  0.04%  BSA. 
Viability  was  determined  using  trypan  blue  staining  and  measured  on  a  Countess  FL  II.  Briefly,  12000 
cells  were  loaded  for  capture  onto  the  Chromium  System  using  the  v2  single  cell  reagent  kit  (10X 
Genomics).  Following  capture  and  lysis,  cDNA  was  synthesized  and  amplified  (12  cycles)  as  per 
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manufacturer's  protocol  (10X  Genomics).  The  amplified  cDNA  was  used  to  construct  an  lllumina 
sequencing  library  and  sequenced  on  a  single  lane  of  a  HiSeq  4000. 

Human  islets  from  one  nondiabetic  deceased  organ  donor  (UNOS  ID  ADIW417)  were  purchased  from 
ProdoLabs  and  processed  to  obtain  a  single  cell  suspension  as  previously  described14.  Briefly,  islets 
were  dissociated  using  Accutase  and  filtered  through  a  prewet  cell  strainer  (BD)  to  collect  single  cells. 
The  single  cell  suspension  was  prepared  and  loaded  onto  the  Chromium  System  as  described  above. 

FASTQ  generation  and  Alignments  -  lllumina  basecall  files  (*.bcl)  were  converted  to  fastqs  using 
cellranger  vl.3,  which  uses  bcl2fastq  v2. 17. 1.14.  FASTQ  files  were  then  aligned  to  hg19  genome  and 
transcriptome  using  the  cellranger  vl. 3  pipeline,  which  generates  a  gene  vs  cell  expression  matrix. 
Clustering  and  marker  gene  identification  -  Cells  with  less  than  500  total  unique  transcripts  were  removed 
prior  to  downstream  analysis.  Genes  for  clustering  were  selected  based  on  normalized  dispersion 
analysis.  Cells  were  clustered  using  Barnes  Hut  t-SNE9  with  the  1000  most  over  dispersed  genes  and 
clusters  identified  using  DBSCAN  (eps  =  5.0,  minpts=15).  Differential  gene  expression  was  computed 
using  edgeR17  and  signature  genes  defined  as  genes  upregulated  2  fold  and  FDR  <  0.01  in  all  pairwise 
comparisons. 

Datasets  and  visualization  -  Access  to  CellView  from:  https://www.jax.org/CellOmics 
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Figure  1:  CellView  enables  cell  type  identification  of  clusters  and  discovery  of  novel  cell  states  in 


PBMC  and  pancreatic  islet  datasets.  A.  CellView’s  graphical  user  interface  has  3  different  features  that 
enables  exploration  of  single  cell  RNA-seq  datasets.  B.  Upon  PBMC  data  upload,  a  3D  plot  of  cells 


clustered  in  t-SNE  space  is  displayed  in  ‘overview’.  Expression  patterns  of  marker  genes  such  as  C. 
CD79A  and  D.  CD3D  can  be  visualized  in  multiple  panels  under  the  ‘Explore’  module  assisting  in  cell 


type  identification  and  to  discover  further  heterogeneity.  E.  3D  display  of  cell  type  clusters  identified  in 
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human  pancreatic  islets.  F.  Analysis  using  the  ‘Co-expression’  module  of  CellView  with  marker  genes 
aids  in  the  identification  the  major  endocrine  cell  populations,  alpha  (cluster  2),  beta  (cluster  3),  gamma 
(cluster  5),  delta  (cluster  4)  along  with  exocrine  cell  types  like  ductal  (cluster  1),  stellate  (cluster  6),  acinar 
(cluster  7)  and  endothelial  (cluster  8)  cells.  Cluster  and  gene  specific  views,  G.  REGIA  and  H.  TPP1 
expression  in  the  ductal  cell  cluster  identifies  cells  in  multiple  states. 


Pmmm  11)4} 


Figure  2:  Investigating  gene  expression  patterns  across  and  within  clusters  with  CellView 
identifies  different  cell  type  populations.  A.  Analysis  using  the  co-expression  module  of  CellView  with 
various  immune  cell  markers  identifies  the  major  subpopulations  present  in  PBMCs.  B.  Exploring  gene 
expression  of  CD14  and  CD68  expression  identifies  a  continuum  from  classical  to  non-classical 
monocytes  that  are  also  characterized  by  difference  in  absolute  transcript  counts  (UMI).  C.  CellView 
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allows  for  identification  of  sub-clusters  by  simple  differential  gene  expression  between  groups  of  selected 
cells  using  a  square  brush  stroke  and  displays  a  sortable  and  searchable  table.  D.  Differential  expression 
of  two  visually  resolved  populations  in  the  dendritic  cell  cluster  identifies  less  abundant  CD141+  dendritic 
cells  expressing  the  CLEC9A  and  DNASE1L3  marker  genes. 
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